SNE Master Research Projects 2020 - 2021

Contact

Cees de Laat, room: C.3.152
Course Codes:

Research Project 1 53841REP6Y
Research Project 2 53842REP6Y

TimeLine


RP1 (January):
  • Wednesday Sept 16, 10h00-11h30: Introduction to the Research Projects.
  • Wednesday Nov xx, xxh00-xxh00: Detailed discussion on selections for RP1.
  • Monday Jan 4th - Friday Jan 29th 2021: Research Project 1.
  • Friday Jan 8th: (updated) research plan due.
  • Monday Feb 1, 10h00-17h00: Presentations RP1 in B1.23 at SP 904.
  • Tuesday Feb 2, 10h00 - 17h00: Presentations RP1 in B1.23 at SP 904.
  • Sunday Feb 6, 24h00: RP - reports due
RP2 (June):
  • Wednesday May xx, 10h00-16h00, Zoom: Detailed discussion on chosen subjects for RP2.
  • Monday Jun xth - Friday Jun 25: Research Project 2.
  • Friday Jun 5th: (updated) research plan due.
  • Thursday Jul 2, 10h00-17h00: presentations.
  • Friday Jul 3, 10h00-17h00: presentations.
  • Sunday Jul 5, 24h00: RP - reports due

Projects

Here is a list of student projects. The leftover projects for this year are listed under LeftOvers.
In a futile attempt to prevent spam "@" is replaced by "=>" in the table.
Color of cell background:
  • Project available.
  • Presentation received.
  • Confidentiality was requested.
  • Currently chosen project.
  • Report received.
  • Blocked, not available.
  • Project plan received.
  • Completed project.
  • Report but no presentation.
  • Outside normal RP timeframe; project will be done in next block.



title, summary, supervisor contact | students | RP 1/2
1

Blockchain's Relationship with Sovrin for Digital Self-Sovereign Identities.

Summary: Sovrin (sovrin.org) is a blockchain for self-sovereign identities. TNO operates one of the nodes of the Sovrin network. Sovrin enables easy exchange and verification of identity information (e.g. “age=18+”) for business transactions. Potential savings are estimated to be over 1 B€ per year for just the Netherlands. However, Sovrin provides only an underlying infrastructure. Additional query-response protocols are needed. This is being studied in, e.g., the Techruption Self-Sovereign-Identity-Framework (SSIF) project. The research question is which functionalities these protocols need. The work includes the development of a data model, as well as an implementation that connects to the Sovrin network.
(2018-05)
Oskar van Deventer <oskar.vandeventer=>tno.nl>




2

Sensor data streaming framework for Unity.

In order to build a Virtual Reality “digital twin” of an existing technical framework (like a smart factory), the static 3D representation needs to “play” sensor data which either is directly connected or comes from a stored snapshot. Although a specific implementation of this already exists, the student is asked to build a more generic framework for this, which is also able to “play” position data of parts of the infrastructure (for example moving robots). This will enable the research on virtually working on a digital twin factory.
Research question:
  • What are the requirements and limitations of a seamless integration of smart factory sensor data for a digital twin scenario?
Available starting points are the existing network capabilities of Unity, existing connectors from Unity to ROS (Robot Operating System) for sensor data transmission, and an existing 3D model that uses position data.
The student is asked to:
  • Build a generic infrastructure which can either play live data or snapshot data.
  • The sensor data will include position data, but also other properties which are displayed in graphs and should be visualized by 2D plots within Unity.
The software framework will be published under an open source license after the end of the project.
Doris Aschenbrenner <d.aschenbrenner=>tudelft.nl>



3

To optimize or not: on the impact of architectural optimizations on network performance.

Project description: Networks are becoming extremely fast. On our testbed with 100Gbps network cards, we can send up to 150 million packets per second with under 1 µs of latency. To support such speeds, many microarchitectural optimizations, such as the use of huge pages and direct cache placement of network packets, need to be in effect. Unfortunately, these optimizations, if not done carefully, can significantly harm performance or security. While the security aspects are becoming clear [1], the end-to-end performance impacts remain unknown. In this project, you will investigate the performance impacts of using huge pages and last-level cache management in high-performance networking environments. If you have always wondered what happens when receiving millions of packets at nanosecond scale, this project is for you!

Requirements: C programming, knowledge of computer architecture and operating systems internals.

[1] NetCAT: Practical Cache Attacks from the Network, Security and Privacy 2020.
Animesh Trivedi <animesh.trivedi=>vu.nl>
Kaveh Razavi <kaveh=>cs.vu.nl>


4

The other faces of RDMA virtualization.

Project description: RDMA is a technology that enables very efficient transfer of data over the network. With 100Gbps RDMA-enabled network cards, it is possible to send hundreds of millions of messages with under 1 µs latency. Traditionally, RDMA has mostly been used in single-user setups in HPC environments. Recently, however, RDMA technology has been commoditized and used in general-purpose workloads such as key-value stores and transaction processing. Major data centers such as Microsoft Azure are already using this technology in their backend services. It is not surprising that there is now support for RDMA virtualization to make it available to virtual machines. We would like you to investigate the limitations of this new technology in terms of isolation and quality of service between different tenants.

Requirements: C programming, knowledge of computer architecture and operating systems internals.

Supervisors: Animesh Trivedi and Kaveh Razavi, VU Amsterdam
Animesh Trivedi <animesh.trivedi=>vu.nl>
Kaveh Razavi <kaveh=>cs.vu.nl>



5

Verification of Object Location Data through Picture Data Mining Techniques.

Shadows in outdoor pictures give away additional information about the location of the objects in them. The position, length, and direction of a shadow can be used to verify the location information found in the metadata of a picture. The objective of this project is to develop algorithms that find freely available images on the internet whose location data has been tampered with. The deliverables of this project are the location verification algorithms, a live web service that verifies the location information of an object, and a non-public-facing database that contains information about images whose location information in the metadata has been removed or falsely altered.
Junaid Chaudhry <chaudhry=>ieee.org>




6

Designing structured metadata for CVE reports.

Vulnerability reports such as MITRE's CVE are currently free-format text, without much structure in them. This makes it hard to process reports by machine, automatically extract useful information, and combine it with other information sources. With tens of thousands of such reports published each year, it is increasingly hard to keep a holistic overview and see patterns. With our open source Binary Analysis Tool we aim to correlate data with firmware databases.

Your task is to analyse how we can use the information from these reports, what metadata is relevant and propose a useful metadata format for CVE reports. In your research you make an inventory of tools that can be used to convert existing CVE reports with minimal effort.

Armijn Hemel - Tjaldur Software Governance Solutions
Armijn Hemel <armijn=>tjaldur.nl>

7

Artificial Intelligence Assisted carving.

Problem Description:
Carving for data and locating files belonging to Principal can be hard if we only use keywords. This still requires a lot of manual work to create keyword lists, which might not even be sufficient to find what we are looking for.
Goal:
  • Create a simple framework to detect documents of a certain set (or company) within carved data by utilizing machine learning. Closely related to document identification.
This research project is currently the only open project at our Forensics department rated at MSc level. Of course, if your students have any ideas for a cybersecurity/forensics related project, they are always welcome to contact us.
Danny Kielman <danny.kielman=>fox-it.com>
Mattijs Dijkstra <mattijs.dijkstra=>fox-it.com>


8

Usage Control in the Mobile Cloud.

Mobile clouds [1] aim to integrate mobile computing and sensing with the rich computational resources offered by cloud back-ends. They are particularly useful in services such as transportation and healthcare, where they are used to collect, process and present data from the physical world. In this thesis, we will focus on usage control, in particular privacy, of the collected data pertinent to mobile clouds. Usage control [2] differs from traditional access control by enforcing security requirements not only on the release of data but also on what happens afterwards. The thesis will involve the following steps:
  • Propose an architecture over cloud for "usage control as a service" (extension of authorization as a service) for the enforcement of usage control policies
  • Implement the architecture (compatible with Openstack[3] and Android) and evaluate its performance.
References
[1] https://en.wikipedia.org/wiki/Mobile_cloud_computing
[2] Jaehong Park, Ravi S. Sandhu: The UCONABC usage control model. ACM Trans. Inf. Syst. Secur. 7(1): 128-174 (2004)
[3] https://en.wikipedia.org/wiki/OpenStack
[4] Slim Trabelsi, Jakub Sendor: "Sticky policies for data control in the cloud" PST 2012: 75-80
Fatih Turkmen <F.Turkmen=>uva.nl>
Yuri Demchenko <y.demchenko=>uva.nl>


9

Security of embedded technology.

Analyzing the security of embedded technology, which operates in an ever changing environment, is Riscure's primary business. Therefore, research and development (R&D) is of utmost importance for Riscure to stay relevant. The R&D conducted at Riscure focuses on four domains: software, hardware, fault injection and side-channel analysis. Potential SNE Master projects can be shaped around the topics of any of these fields. We would like to invite interested students to discuss a potential Research Project at Riscure in any of the mentioned fields. Projects will be shaped according to the requirements of the SNE Master.
Please have a look at our website for more information: https://www.riscure.com
Previous Research Projects conducted by SNE students:
  1. https://www.os3.nl/_media/2013-2014/courses/rp1/p67_report.pdf
  2. https://www.os3.nl/_media/2011-2012/courses/rp2/p61_report.pdf
  3. http://rp.delaat.net/2014-2015/p48/report.pdf
  4. https://www.os3.nl/_media/2011-2012/courses/rp2/p19_report.pdf
If you want to see what the atmosphere is at Riscure, please have a look at: https://vimeo.com/78065043
Please let us know if you have any additional questions!
Ronan Loftus <loftus=>riscure.com>
Alexandru Geana <Geana=>riscure.com>
Karolina Mrozek <Mrozek=>riscure.com>
Dana Geist <geist=>riscure.com>



10

Video broadcasting manipulation detection.

This project concerns the detection of manipulation of broadcast video streams with facial morphing on the internet. Examples are provided at https://dl.acm.org/citation.cfm?id=2818122 and in other online sources.
Zeno Geradts <Z.J.M.H.Geradts=>uva.nl>

11

Cross-blockchain oracle.

Interconnection between different blockchain instances, and smart contracts residing on those, will be essential for a thriving multi-blockchain business ecosystem. Technologies like hashed timelock contracts (HTLC) enable atomic swaps of cryptocurrencies and tokens between blockchains. A next challenge is the cross-blockchain oracle, where the status of an oracle value on one blockchain enables or prevents a transaction on another blockchain.
The goal of this research project is to explore the possibilities, impossibilities, trust assumptions, security and options for a cross-blockchain oracle, as well as to provide a minimal viable implementation.
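
As background for the atomic swaps mentioned above, the following is a minimal sketch of the hashed timelock idea, written as plain Python rather than as a contract on any real blockchain. The class name `HTLC`, its fields, the block heights, and the `demo_swap` helper are all assumptions made for illustration; a cross-blockchain oracle would need additional machinery that is not shown here.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class HTLC:
    """Toy hashed timelock contract (hypothetical, not a real chain API)."""
    sender: str
    recipient: str
    amount: int
    hashlock: bytes        # SHA-256 digest of a secret preimage
    timeout: int           # block height from which the sender may refund
    settled: bool = False

    def claim(self, preimage: bytes, height: int) -> str:
        """Recipient takes the funds by revealing the preimage in time."""
        if self.settled:
            raise ValueError("already settled")
        if height >= self.timeout:
            raise ValueError("timelock expired")
        if hashlib.sha256(preimage).digest() != self.hashlock:
            raise ValueError("wrong preimage")
        self.settled = True
        return self.recipient

    def refund(self, height: int) -> str:
        """Sender recovers the funds once the timelock has expired."""
        if self.settled:
            raise ValueError("already settled")
        if height < self.timeout:
            raise ValueError("timelock still active")
        self.settled = True
        return self.sender


def demo_swap():
    """Alice and Bob swap assets on two chains using one shared hashlock."""
    secret = b"alice-secret"
    lock = hashlib.sha256(secret).digest()
    # Alice locks first with the longer timeout; Bob follows with a shorter one.
    on_chain_a = HTLC("alice", "bob", 10, lock, timeout=200)
    on_chain_b = HTLC("bob", "alice", 5, lock, timeout=100)
    # Alice claims on chain B, which reveals the secret; Bob reuses it on chain A.
    first = on_chain_b.claim(secret, height=50)
    second = on_chain_a.claim(secret, height=60)
    return first, second
```

In this sketch the swap is atomic in the sense that either both claims succeed with the same secret, or both parties can refund after their respective timeouts.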
(2018-05)
Oskar van Deventer <oskar.vandeventer=>tno.nl>
Maarten Everts <maarten.everts=>tno.nl>


12

Man vs the Machine.

Machine Learning has advanced to the point where our computer systems can detect malicious activity through baselining of large volumes of data and picking out the anomalies and non-conformities. As an example, the finance sector has been using machine learning to detect fraudulent transactions and has been very successful at minimizing the impact of stolen credit card numbers over the past few years.
  • As we further leverage machine learning and other advanced analytics to improve cyber security detection in other industries, what does the role of a cybersecurity analyst evolve into?
  • What are the strengths of machine learning?
  • What are its weaknesses? What activities remain after machine learning?
  • How and when does AI come into the picture?
  • What are the key skills needed to still be relevant?
  • What emerging technologies are contributing to the change?
  • What do new individuals entering into cyber security focus on?
  • And what do existing cyber security professionals develop to stay current?
  • What will the industry look like in 2 years? 5 years? 10+ years?
Rita Abrantes <Rita.Abrantes=>shell.com>

13

Inventory of smartcard-based healthcare identification solutions in Europe and beyond: technology and adoption.

For potential international adoption of Whitebox technology in the future, in particular the technique of patients carrying authorization codes with them to authorize healthcare professionals, we want to make an inventory of the current status of healthcare PKIs and smartcard technology in Europe and if possible also outside Europe.

Many countries have developed health information exchange systems over the last 1-2 decades, most of them without much regard for what other countries are doing, or for international interoperability. However, common to most systems developed today is the development of a (per-country) PKI for credentials, typically smartcards, that are provided to healthcare professionals to allow the health information exchange system to identify these professionals, and to establish their 'role' (or rather: the speciality of a doctor, such as GP, pharmacist, gynaecologist, etc.). We know a few of these smartcard systems, e.g., in Austria and France, but not all of them, and we do not know their degree of adoption.

In this project, we would like students to enquire about and report on the state of the art of healthcare smartcard systems in Europe and possibly outside Europe (e.g., Asia, Russia):
  • What products are rolled out by what companies, backed by what CAs (e.g., governmental, as is the case with the Dutch "UZI" healthcare smartcard)?
  • Is it easy to obtain the relevant CA keys?
  • And what is the adoption rate of these smartcards among GPs, emergency care wards, and hospitals in different countries?
  • What are relevant new developments (e.g., contactless solutions) proposed by major stakeholders or industry players in the market?
Note that this project is probably less technical than usual for an SNE student, although it is technically interesting. For comparison, this project may also be fitting for an MBA student.

For more information, see also (in Dutch): https://whiteboxsystems.nl/sne-projecten/#project-2-onderzoek-adoptie-health-smartcards-in-europa-en-daarbuiten
General introduction
Whitebox Systems is a UvA spin-off company working on a decentralized system for health information exchange. Security and privacy protection are key concerns for the products and standards provided by the company. The main product is the Whitebox, a system owned by doctors (GPs) that is used by the GP to authorize other healthcare professionals so that they - and only they - can retrieve information about a patient when needed. Any data transfer is protected end-to-end; central components and central trust are avoided as much as possible. The system will use a published source model, meaning that although we do not give away copyright, the code can be inspected and validated externally.

The Whitebox is currently transitioning from an authorization model that started with doctor-initiated static connections/authorizations, to a model that includes patient-initiated authorizations. Essentially, patients can use an authorization code (a kind of token) that is generated by the Whitebox, to authorize a healthcare professional at any point of care (e.g., a pharmacist or a hospital). Such a code may become part of a referral letter or a prescription. This transition gives rise to a number of interesting questions, and thus to possible research projects related to the Whitebox design, implementation and use. Two of these projects are described below. If you are interested in these projects or have questions about other possibilities, please contact <guido=>whiteboxsystems.nl>.

For a more in-depth description of the projects below (in Dutch), please see https://whiteboxsystems.nl/sne-projecten/
Guido van 't Noordende <g.j.vantnoordende=>uva.nl>


14

Decentralized trust and key management.

Currently, the Whitebox provides a means for doctors (general practitioners, GPs) to establish static trusted connections with parties they know personally. These connections (essentially, authenticated TLS connections with known, validated keys), once established, can subsequently be used by the GP to authorize the party in question to access particular patient information. Examples are static connections to the GP post, which takes care of evening/night and weekend shifts, or to a specific pharmacist. In this model, trust management is intuitive and direct. However, with dynamic authorizations established by patients (see general description above), the question arises whether the underlying (trust) connections between the GP practice (i.e., the Whitebox) and the authorized organization (e.g., a hospital or pharmacist) may be reusable as a 'trusted' connection by the GP in the future.

The basic question is:
  • What is the degree of trust a doctor can place in (trust) relations that are established by this doctor's patients, when they authorize another healthcare professional?
More generally:
  • What degree of trust can be placed in relations/connections established by a patient, also in view of possible theft of authorization tokens held by patients?
  • What kind of validation methods can exist for a GP to increase or validate a given trust relation implied by an authorization action of a patient?
Perhaps the problem can also be raised to a higher level: can (public) auditing mechanisms -- for example, using blockchains -- be used to help establish and validate trust in organizations (technically: keys of such organizations), in systems that implement decentralized trust-based transactions, like the Whitebox system does?

In this project, the student(s) may either implement part of a solution or design, or model the behavior of a system inspired by the decentralized authorization model of the Whitebox.

As an example: reputation based trust management based on decentralized authorization actions by patients of multiple doctors may be an effective way to establish trust in organization keys, over time. Modeling trust networks may be an interesting contribution to understanding the problem at hand, and could thus be an interesting student project in this context.

NB: this project is a rather advanced/involved design and/or modelling project. Students should be confident in their ability to understand and design/model a complex system in the relatively short timeframe provided by an RP2 project -- this project is not for the faint of heart. Once completed, an excellent implementation or evaluation may become the basis for a research paper.

See also (in Dutch): https://whiteboxsystems.nl/sne-projecten/#project-2-ontwerp-van-een-decentraal-vertrouwensmodel
Guido van 't Noordende <g.j.vantnoordende=>uva.nl>




15

LDBC Graphalytics.

LDBC Graphalytics is a mature, industrial-grade benchmark for graph-processing platforms. It consists of six deterministic algorithms, standard datasets, synthetic dataset generators, and reference output, which enable the objective comparison of graph analysis platforms. Its test harness produces deep metrics that quantify multiple kinds of system scalability, such as horizontal/vertical and weak/strong, and of robustness, such as failures and performance variability. The benchmark comes with open-source software for generating data and monitoring performance.

Until recently, graph processing used only common big data infrastructure, that is, with much local and remote memory per core and storage on disk. However, operating separate HPC and big data infrastructures is increasingly unsustainable. The energy and (human) resource costs far exceed what most organizations can afford. Instead, we see a convergence between big data and HPC infrastructure.
For example, next-generation HPC infrastructure includes more cores and hardware threads than ever before. This leads to a large search space for application developers to explore when adapting their workloads to the platform.

To take a step towards a better understanding of performance for graph processing platforms on next-generation HPC infrastructure, we would like to work together with 3-5 students on the following topics:
  1. How to configure graph processing platforms to efficiently run on many/multi-core devices, such as the Intel Knights Landing, which exhibits configurable and dynamic behavior?
  2. How to evaluate the performance of modern many-core platforms, such as the NVIDIA Tesla?
  3. How to set up a fair, reproducible experiment to compare and benchmark graph-processing platforms?
Alex Uta <a.uta=>vu.nl>
Marc X. Makkes <m.x.makkes=>vu.nl>


16

Network aware performance optimization for Big Data applications using coflows.

Optimizing data transmission is crucial to improving the performance of data-intensive applications. Network traffic control often plays a key role in optimising data transmission, especially when data volumes are very large. In many cases, data-intensive jobs can be divided into multiple successive computation stages, e.g., in MapReduce-type jobs. A computation stage relies on the outputs of the previous stage and cannot start until all its required inputs are in place. Inter-stage data transfer involves a group of parallel flows, which share the same performance goal, such as minimising the flows' completion time.

CoFlow is an application-aware network control model for cluster-based data centric computing. The CoFlow framework is able to schedule the network usage based on the abstract application data flows (called coflows). However, customizing CoFlow for different application patterns, e.g., choosing proper network scheduling strategies, is often difficult, in particular when the high level job scheduling tools have their own optimizing strategies.
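
To illustrate why scheduling at the level of coflows matters, here is a minimal sketch under strong simplifying assumptions: a single shared link, coflows served one after another, and a coflow counted as complete only when its last flow has finished. The function names, the example coflows `A` and `B`, and their flow sizes are hypothetical; this is not the algorithm used by Varys or Aalo.

```python
def coflow_completion_times(coflows, order, capacity=1.0):
    """Serve whole coflows back to back on one link of the given capacity.

    coflows maps a coflow name to a list of flow sizes. A coflow completes
    when all of its flows have been transferred.
    """
    clock = 0.0
    done = {}
    for name in order:
        clock += sum(coflows[name]) / capacity
        done[name] = clock
    return done


def average(values):
    """Arithmetic mean of a non-empty collection of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Hypothetical workload: one large coflow and one small coflow.
coflows = {"A": [4, 4], "B": [1, 1]}

# Arrival order: the small coflow waits behind the large one.
fifo = coflow_completion_times(coflows, ["A", "B"])

# Smallest total size first: the small coflow finishes early.
by_size = sorted(coflows, key=lambda name: sum(coflows[name]))
smallest_first = coflow_completion_times(coflows, by_size)

print(average(fifo.values()), average(smallest_first.values()))
```

With these numbers, serving the smaller coflow first lowers the average coflow completion time from 9 to 6 time units, while the completion time of the last coflow stays at 10.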

The project aims to profile the behavior of CoFlow with different computing platforms, e.g., Hadoop and Spark etc.
  1. Review the existing CoFlow scheduling strategies and related work
  2. Prototype test applications using big data platforms (including Apache Hadoop, Spark, Hive, Tez).
  3. Set up coflow test bed (Aalo, Varys etc.) using existing CoFlow installations.
  4. Benchmark the behavior of CoFlow in different application patterns, and characterise the behavior.
Background reading:
  1. CoFlow introduction: http://www2.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-211.pdf
  2. Junchao Wang, Huan Zhou, Yang Hu, Cees de Laat and Zhiming Zhao, Deadline-Aware Coflow Scheduling in a DAG, in NetCloud 2017, Hong Kong, to appear [upon request]
More info: Junchao Wang, Spiros Koulouzis, Zhiming Zhao
Zhiming Zhao <z.zhao=>uva.nl>

17

Elastic data services for time critical distributed workflows.

Large-scale observations over extended periods of time are necessary for constructing and validating models of the environment. Therefore, it is necessary to provide advanced computational networked infrastructure for transporting large datasets and performing data-intensive processing. Data infrastructures manage the lifecycle of observation data and provide services for users and workflows to discover, subscribe and obtain data for different application purposes. In many cases, applications have high performance requirements, e.g., disaster early warning systems.

This project focuses on data aggregation and processing use-cases from European research infrastructures, and investigates how to optimise infrastructures to meet critical time requirements of data services, in particular for different patterns of data-intensive workflow. The student will use some initial software components [1] developed in the ENVRIPLUS [2] and SWITCH [3] projects, and will:
  1. Model the time constraints for the data services and the characteristics of data access patterns found in given use cases.
  2. Review the state of the art technologies for optimising virtual infrastructures.
  3. Propose and prototype an elastic data service solution based on a number of selected workflow patterns.
  4. Evaluate the results using a use case provided by an environmental research infrastructure.
Reference:
  1. https://staff.fnwi.uva.nl/z.zhao/software/drip/
  2. http://www.envriplus.eu
  3. http://www.switchproject.eu
More info: Spiros Koulouzis, Paul Martin, Zhiming Zhao
Zhiming Zhao <z.zhao=>uva.nl>

18

Contextual information capture and analysis in data provenance.

Tracking the history of events and the evolution of data plays a crucial role in data-centric applications for ensuring reproducibility of results, diagnosing faults, and performing optimisation of data-flow. Data provenance systems [1] are a typical solution, capturing and recording the events generated in the course of a process workflow using contextual metadata, and providing querying and visualisation tools for use in analysing such events later.

Conceptual models such as W3C PROV (and extensions such as ProvONE), OPM and CERIF have been proposed to describe data provenance, and a number of different solutions have been developed. Choosing a suitable provenance solution for a given workflow system or data infrastructure requires consideration of not only the high-level workflow or data pipeline, but also performance issues such as the overhead of event capture and the volume of provenance data generated.

The project will be conducted in the context of EU H2020 ENVRIPLUS project [1, 2]. The goal of this project is to provide practical guidelines for choosing provenance solutions. This entails:
  1. Reviewing the state of the art for provenance systems.
  2. Prototyping sample workflows that demonstrate selected provenance models.
  3. Benchmarking the results of sample workflows, and defining guidelines for choosing between different provenance solutions (considering metadata, logging, analytics, etc.).
References:
  1. About project: http://www.envriplus.eu
  2. Provenance background in ENVRIPLUS: https://surfdrive.surf.nl/files/index.php/s/uRa1AdyURMtYxbb
  3. Michael Gerhards, Volker Sander, Torsten Matzerath, Adam Belloum, Dmitry Vasunin, and Ammar Benabdelkader. 2011. Provenance opportunities for WS-VLAM: an exploration of an e-science and an e-business approach. In Proceedings of the 6th workshop on Workflows in support of large-scale science (WORKS '11). http://dx.doi.org/10.1145/2110497.2110505
More info: Zhiming Zhao, Adam Belloum, Paul Martin
Zhiming Zhao <z.zhao=>uva.nl>

19

Profiling Partitioning Mechanisms for Graphs with Different Characteristics.

In computer systems, graphs are an important model for describing many things, such as workflows, virtual infrastructures, and ontological models. Partitioning is a frequently used graph operation in contexts such as parallelizing workflow execution, mapping networked infrastructures onto distributed data centers [1], and controlling the load balance of resources. However, developing an effective partitioning solution is often not easy; it is often a complex optimization problem that involves constraints such as system performance and cost.

A comprehensive benchmark of graph partitioning mechanisms helps in choosing a partitioning solver for a specific model. Such a portfolio can also give advice on how to partition based on the characteristics of the graph. This project aims at benchmarking the existing partitioning algorithms for graphs with different characteristics, and profiling their applicability to specific types of graphs.
This project will be conducted in the context of the EU SWITCH [2] project. The students will:
  1. Review the state of the art of the graph partitioning algorithms and related tools, such as Chaco, METIS and KaHIP, etc.
  2. Investigate how to define the characteristics of a graph, such as sparse graph, skewed graph, etc. This can also be discussed with different graph models, like planar graph, DAG, hypergraph, etc.
  3. Build a benchmark for different types of graphs with various partitioning mechanisms and find the relationship between them.
  4. Discuss how to choose a partitioning mechanism based on the graph characteristics.
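
A benchmark of this kind needs quality metrics for a given partition. The sketch below computes two common ones, edge cut and load imbalance, for a small hand-made graph. The function names and the example graph are assumptions chosen for illustration; tools such as METIS or KaHIP report similar quantities, but their exact definitions should be checked in their own documentation.

```python
def edge_cut(edges, part):
    """Number of edges whose endpoints lie in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])


def imbalance(part, k):
    """Largest partition size divided by the ideal size len(part) / k."""
    sizes = [0] * k
    for block in part.values():
        sizes[block] += 1
    return max(sizes) / (len(part) / k)


# Hypothetical graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# Partition that follows the structure: one triangle per block.
by_triangle = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

# Equally balanced partition that ignores the structure.
alternating = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}

for name, part in [("by_triangle", by_triangle), ("alternating", alternating)]:
    print(name, edge_cut(edges, part), imbalance(part, k=2))
```

Both partitions are perfectly balanced, yet the first cuts one edge and the second cuts five, which is the kind of difference a benchmark across graph types would have to capture.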
Reading material:
  1. Zhou, H., Hu Y., Wang, J., Martin, P., de Laat, C. and Zhao, Z., (2016) Fast and Dynamic Resource Provisioning for Quality Critical Cloud Applications, IEEE International Symposium On Real-time Computing (ISORC) 2016, York UK http://dx.doi.org/10.1109/ISORC.2016.22
  2. SWITCH: www.switchproject.eu

More info: Huan Zhou, Arie Taal, Zhiming Zhao

Zhiming Zhao <z.zhao=>uva.nl>

20

Auto-Tuning for GPU Pipelines and Fused Kernels.

Achieving high performance on many-core accelerators is a complex task, even for experienced programmers. This task is made even more challenging by the fact that, to achieve high performance, code optimization is not enough, and auto-tuning is often necessary. The reason for this is that computational kernels running on many-core accelerators need ad-hoc configurations that are a function of kernel, input, and accelerator characteristics to achieve high performance. However, tuning kernels in isolation may not be the best strategy for all scenarios.

Imagine having a pipeline that is composed of a certain number of computational kernels. You can tune each of these kernels in isolation, and find the optimal configuration for each of them. Then you can use these configurations in the pipeline, and achieve some level of performance. But these kernels may depend on each other, and may also influence each other. What if the choice of a certain memory layout for one kernel causes performance degradation on another kernel?

One of the existing optimization strategies to deal with pipelines is to fuse kernels together, to simplify execution patterns and decrease overhead. In this project we aim to measure the performance of accelerated pipelines in three different tuning scenarios:
  1. tuning each component in isolation,
  2. tuning the pipeline as a whole, and
  3. tuning the fused kernel.
By measuring the performance of one or more pipelines in these scenarios we hope, on one level, to determine which strategy is best for specific pipelines on different hardware platforms and, on another level, to better understand which characteristics influence this behavior.
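The difference between the first two scenarios can be sketched with a toy cost model. All numbers below are hypothetical and not measurements; the assumption is that two kernels each prefer a different memory layout and that converting between layouts has a cost.

```python
from itertools import product

# Hypothetical per-kernel run times (ms) for each memory layout; not measured data.
KERNEL_A = {"row": 4.0, "col": 6.0}
KERNEL_B = {"row": 9.0, "col": 5.0}
CONVERSION_COST = 3.0  # assumed cost of re-laying out data between the kernels

def pipeline_time(layout_a, layout_b):
    """Total time of the two-kernel pipeline, including any layout conversion."""
    penalty = CONVERSION_COST if layout_a != layout_b else 0.0
    return KERNEL_A[layout_a] + KERNEL_B[layout_b] + penalty

# Scenario 1: tune each kernel in isolation, then combine the winners.
isolated = (min(KERNEL_A, key=KERNEL_A.get), min(KERNEL_B, key=KERNEL_B.get))
# Scenario 2: tune the pipeline as a whole over all layout combinations.
whole = min(product(KERNEL_A, KERNEL_B), key=lambda cfg: pipeline_time(*cfg))

print("isolated:", isolated, pipeline_time(*isolated))
print("whole:   ", whole, pipeline_time(*whole))
```

In this made-up model the per-kernel optima pay the conversion penalty, so tuning the pipeline as a whole selects a different configuration with a lower total time.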
Rob van Nieuwpoort <R.vanNieuwpoort=>uva.nl>

21

Speeding up next generation sequencing of potatoes

Genotype and single nucleotide polymorphism (SNP) calling is a technique to find bases in next-generation sequencing data that differ from a reference genome. This technique is commonly used in (plant) genetic research. However, most algorithms only support calling in diploid heterozygous organisms (specifically human). Within the realm of plant breeding, many species are polyploid (e.g. potato with four copies, wheat with six copies and strawberry with eight copies). For genotype and SNP calling in these organisms, only a few algorithms exist, such as freebayes (https://github.com/ekg/freebayes). However, with the increasing amount of next-generation sequencing data being generated, we are noticing limits to the scalability of this methodology, both in compute time and memory consumption (>100Gb).
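To make the idea of SNP calling concrete, here is a deliberately naive pileup-based sketch. It is not the Bayesian, haplotype-based method that freebayes implements; the function name, the gap-free read format and the `min_fraction` threshold are assumptions for illustration. With even coverage of a tetraploid, a variant present on one of the four copies would be expected in roughly a quarter of the reads, which is why the threshold is set below 0.25.

```python
from collections import Counter

def call_snps(reference, reads, min_fraction=0.2):
    """Report (position, ref_base, alt_base, fraction) for alternate bases.

    reads: list of (start, sequence) tuples, aligned to the reference without gaps.
    """
    pileup = [Counter() for _ in reference]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pileup[start + offset][base] += 1
    calls = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        for base, n in sorted(counts.items()):
            if base != reference[pos] and n / depth >= min_fraction:
                calls.append((pos, reference[pos], base, n / depth))
    return calls

reference = "ACGTAC"
# Hypothetical reads: one in four carries an A instead of G at position 2.
reads = [(0, "ACGTAC"), (0, "ACGTAC"), (0, "ACATAC"), (0, "ACGTAC")]
calls = call_snps(reference, reads)
print(calls)
```

The pileup holds one counter per reference position, which hints at why memory consumption grows with genome size and sequencing depth.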

We are looking for a student with a background in computer science, who will perform the following tasks:

  • Examine the current implementation of the freebayes algorithm
  • Identify bottlenecks in memory consumption and compute performance
  • Come up with an improved strategy to reduce memory consumption of the freebayes algorithm
  • Come up with an improved strategy to execute this algorithm on a cluster with multiple CPUs or on GPUs (using the memory of multiple compute nodes)
  • Implement an improved version of freebayes, according to the guidelines established above
  • Test the improved algorithm on real datasets of potato.
This is a challenging master thesis project on an important food crop (potato) on a problem which is relevant for both science and industry. As part of the thesis, you will be given the opportunity to present your progress/results to relevant industrial partners for the Dutch breeding industry.

Occasional traveling to Wageningen will be required.
Rob van Nieuwpoort <R.vanNieuwpoort=>uva.nl>

22

Auto-tuning for Power Efficiency.

Auto-tuning is a well-known optimization technique in computer science. It has been used to ease the manual optimization process that is traditionally performed by programmers, and to maximize performance portability. Auto-tuning works by simply executing the code that has to be tuned many times on a small problem set, with different tuning parameters. The best performing version is then used for the real problems. Tuning can be done with application-specific parameters (different algorithms, granularity, convergence heuristics, etc.) or platform parameters (number of parallel threads used, compiler flags, etc.).
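The loop described above can be sketched in a few lines. This is a minimal CPU-only illustration, assuming a hypothetical kernel (`chunked_sum`) with a single tuning parameter and wall-clock time as the evaluation function; a GPU tuner would launch real kernels instead.

```python
import time

def chunked_sum(data, chunk_size):
    """Sum a list in chunks; chunk_size is the tuning parameter."""
    total = 0
    for start in range(0, len(data), chunk_size):
        total += sum(data[start:start + chunk_size])
    return total

def auto_tune(kernel, sample, candidates, repeats=3):
    """Run the kernel on a small sample for each candidate; return the fastest."""
    timings = {}
    for value in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(sample, value)
            best = min(best, time.perf_counter() - start)
        timings[value] = best
    return min(timings, key=timings.get), timings

sample = list(range(20_000))            # small problem set, used only for tuning
candidates = [16, 256, 4096]            # hypothetical tuning parameter values
best_chunk, timings = auto_tune(chunked_sum, sample, candidates)
result = chunked_sum(list(range(100_000)), best_chunk)   # the "real" problem
print(best_chunk, result)
```

The evaluation function is the only part that depends on what is being optimized: replacing the timer with another measurement changes the tuning objective without changing the search loop.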

For this project, we apply auto-tuning on GPUs. We have several GPU applications where the absolute performance is not the most important bottleneck for the application in the real world. Instead the power dissipation of the total system is critical. This can be due to the enormous scale of the application, or because the application must run in an embedded device. An example of the first is the Square Kilometre Array, a large radio telescope that currently is under construction. With current technology, it will need more power than all of the Netherlands combined. In embedded systems, power usage can be critical as well. For instance, we have GPU codes that make images for radar systems in drones. The weight and power limitations are an important bottleneck (batteries are heavy).

In this project, we use power dissipation as the evaluation function for the auto-tuning system. Earlier work by others investigated this, but only for a single compute-bound application. However, many realistic applications are memory-bound. This is a problem, because loading a value from the L1 cache can already take 7-15x more energy than an instruction that only performs a computation (e.g., multiply).

There are also interesting platform parameters that can be changed in this context. It is possible to change both core and memory clock frequencies, for instance. It will be interesting to see if we can, at runtime, achieve the optimal balance between these frequencies.

We want to perform auto-tuning on a set of GPU benchmark applications that we developed.
Rob van Nieuwpoort <R.vanNieuwpoort=>uva.nl>

23

Applying and Generalizing Data Locality Abstractions for Parallel Programs.

TIDA is a library for high-level programming of parallel applications, focusing on data locality. TIDA has been shown to work well for grid-based operations, like stencils and convolutions. These are an important building block for many simulations in astrophysics, climate simulations and water management, for instance. The TIDA paper gives more details on the programming model.

This project aims to achieve several things and answer several research questions:

  • TIDA currently only works with up to 3D. In many applications we have, higher dimensionalities are needed. Can we generalize the model to N dimensions?
  • The model currently only supports a two-level hierarchy of data locality. However, modern memory systems often have many more levels, both on CPUs and GPUs (e.g., L1, L2 and L3 cache, main memory, memory banks coupled to a different core, etc). Can we generalize the model to support N-level memory hierarchies?
  • The current implementation only works on CPUs. Can we generalize to GPUs as well?
  • Given the above generalizations, can we still implement the model efficiently? How should we perform the mapping from the abstract hierarchical model to a real physical memory system?
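To give a feel for the N-dimensional question, the sketch below splits a grid of arbitrary dimensionality into tiles. It is not TIDA's API; the `tiles` and `tile_size` helpers are hypothetical and only show that the decomposition itself does not depend on the number of dimensions.

```python
import math
from itertools import product

def tiles(shape, tile_shape):
    """Yield one tuple of slices per tile of an N-dimensional grid."""
    ranges = [range(0, extent, step) for extent, step in zip(shape, tile_shape)]
    for origin in product(*ranges):
        yield tuple(
            slice(start, min(start + step, extent))
            for start, step, extent in zip(origin, tile_shape, shape)
        )

def tile_size(tile):
    """Number of grid cells covered by a tile."""
    return math.prod(s.stop - s.start for s in tile)

# A 4-D grid of shape 4x4x4x4 split into tiles of shape 2x2x2x4.
all_tiles = list(tiles((4, 4, 4, 4), (2, 2, 2, 4)))
covered_cells = sum(tile_size(t) for t in all_tiles)
print(len(all_tiles), covered_cells)
```

Applying the same function again to each tile would give a second level of the hierarchy; mapping those levels onto real caches and memory banks is the open part of the question.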

We want to test the new extended model on a real application. We have examples available in many domains. The student can pick one that is of interest to her/him.
Rob van Nieuwpoort <R.vanNieuwpoort=>uva.nl>

24

Ethereum Smart Contract Fuzz Testing.

An Ethereum smart contract can be seen as a computer program that runs on the Ethereum Virtual Machine (EVM), with the ability to accept, hold and transfer funds programmatically. Once a smart contract has been placed on the blockchain, it can be executed by anyone. Furthermore, many smart contracts accept user input. Because smart contracts operate on a cryptocurrency with real value, security of smart contracts is of the utmost importance. I would like to create a smart contract fuzzer that will check for unexpected behaviour or crashes of the EVM. Based on preliminary research, such a fuzzer does not exist yet.
Rodrigo Marcos <rodrigo.marcos=>secforce.com>




25

Smart contracts specified as contracts.

Developing a distributed state of mind: from control flow to control structure

The concepts of control flow, of data structure, as well as that of data flow are well established in the computational literature; in contrast, one can find different definitions of control structures, and typically these are not associated to the common use of the term, referring to the power relationships holding in society or in organizations.

The goal of this project is the design and development of a social architecture language that cross-compiles to a modern concurrent programming language (Rust, Go, or Scala), in order to make explicit a multi-threaded, distributed state of mind, following results obtained in agent-based programming. The starting point will be a minimal language subset of AgentSpeak(L).

Potential applications: controlled machine learning for Responsible AI, control of distributed computation
Giovanni Sileno <G.Sileno=>uva.nl>
Mostafa Mohajeriparizi <m.mohajeriparizi=>uva.nl>


26

Zero Trust Validation.

ON2IT advocates the Zero Trust Validation conceptual strategy [1] to strengthen information security at the architectural level. Zero Trust is often mistakenly perceived as an architectural approach. However, it is, in the end, a strategic approach towards protecting assets regardless of location. To enable this approach, controls are needed to provide sufficient insight (visibility), to exert control, and to provide operational feedback. However, these controls/probes are not naturally available in all environments. Finding ways to embed such controls, and finding/applying them, can be challenging, especially in the context of containerized, cloud and virtualized workflows.

At the strategic level, Zero Trust is not sufficiently perceived as a value contributor. At the managerial level, it is perceived mainly as an architectural ‘toy’. This makes it hard to translate a Zero Trust strategic approach to the operational level; there is a lack of overall coherence. For this reason, ON2IT developed a Zero Trust Readiness Assessment framework, which facilitates testing readiness on three levels: governance, management and operations.

Research (sub)questions that emerge:
  • What is missing in the current approach of ZTA to make it resonate with the board?
    • What are Critical Success Factors for drafting and implementing ZTA?
    • What is an easy to consume capability maturity or readiness model for the adoption of ZTA that guides boards and management teams in making the right decisions?
    • What does a management portal with associated KPIs need to offer in order to enable board and management to manage and monitor the ZTA implementation process and take appropriate ownership?
    • How do we add the necessary controls and efficiently leverage the control and monitoring facilities thus provided?
References:
  1. Zero Trust Validation
  2. "On Exploring Research Methods for Business Information Security Alignment and Artefact Engineering" by Yuri Bobbert, University of Antwerp
Jeroen Scheerder <Jeroen.Scheerder=>on2it.net>



27

Detection Real time video attack.

A smart attack was investigated that can be performed only at the time of recording video, against a smart VSS (Video Surveillance System). The main feature of a smart VSS is its ability to capture a scene immediately on detection of a particular activity or object. This new design has become a target for attackers, who trigger the injection of false frames (false frame injection attack).

The research question is whether, and how, we can detect afterwards that this happened, if challenged in court.

Reference:
D. Nagothu, J. Schwell, Y. Chen, E. Blasch, S. Zhu, A study on smart online frame forging attacks against video surveillance system, in: Sensors and Systems for Space Applications XII, Vol. 11017, International Society for Optics and Photonics, 2019, p. 110170 (2019)
Zeno Geradts <zeno=>holmes.nl>



28

OSINT Washing Street.

At the moment more and more OSINT is available via all kinds of sources, a lot of them legitimate services that are used by malicious actors. Examples are GitHub, Pastebin, Twitter, etc. If you look at Pastebin data you might find IOCs/TTPs, but the payloads are usually delivered in many stages, so it is important to have a system that follows the path until it finds the real payload. The question here is how you can build a generic pipeline that unravels data like a matryoshka doll, so that no matter the input, the pipeline will try to decode, query or perform whatever relevant action is needed. This would result in better insight into the later stages of an attack. An example of a framework using this method is Stoq (https://github.com/PUNCH-Cyber/stoq), but it lacks research into its usability and into whether the results add value compared to other OSINT sources.
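The matryoshka idea can be sketched as a loop that keeps applying whichever decoder fits until none does. This is a minimal illustration, not how Stoq works; the decoder set (gzip, hex, base64), the function names and the sample URL are assumptions, and the format detection is naive enough to misfire on short plain-text inputs.

```python
import base64
import binascii
import gzip
import zlib

def try_decoders(data):
    """Return (name, decoded) for the first decoder that applies, else None."""
    if data[:2] == b"\x1f\x8b":                       # gzip magic bytes
        try:
            return "gzip", gzip.decompress(data)
        except (OSError, EOFError, zlib.error):
            pass
    stripped = data.strip()
    if stripped and len(stripped) % 2 == 0:
        try:
            return "hex", binascii.unhexlify(stripped)
        except binascii.Error:
            pass
    if stripped and len(stripped) % 4 == 0:
        try:
            return "base64", base64.b64decode(stripped, validate=True)
        except binascii.Error:
            pass
    return None

def unravel(data, max_depth=10):
    """Peel encoding layers until no decoder applies or max_depth is reached."""
    layers = []
    for _ in range(max_depth):
        step = try_decoders(data)
        if step is None:
            break
        name, data = step
        layers.append(name)
    return layers, data

# Hypothetical staged payload: a URL wrapped in gzip, then base64, then hex.
payload = b"http://example.invalid/stage2"
wrapped = binascii.hexlify(base64.b64encode(gzip.compress(payload)))
layers, inner = unravel(wrapped)
print(layers, inner)
```

A real pipeline would add actions beyond decoding, such as fetching a URL found in the innermost layer, and would need a more robust way of deciding which action applies.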
Leandro Velasco <leandro.velasco=>kpn.com>
Joao Novaismarques <joao.novaismarques=>kpn.com>


29

Building an open-source, flexible, large-scale static code analyzer.

Background information
Data drives business, and maybe even the world. Businesses that make it their business to gather data are often aggregators of client-side generated data. Client-side generated data, however, is inherently untrustworthy. Malicious users can construct their data to exploit careless, or naive, programming and use this malicious, untrusted data to steal information or even take over systems.
It is no surprise that large companies such as Google, Facebook and Yahoo spend considerable resources in securing their own systems against would-be attackers. Generally, many methods have been developed to make untrusted data cross the trust-boundary to trusted data, and effectively make malicious data harmless. However, securing your systems against malicious data often requires expertise beyond what even skilled programmers might reasonably possess.
Problem description
Ideally, tools that analyze code for vulnerabilities would be used to detect common security issues. Such tools, or static code analyzers, exist, but are either outdated (http://rips-scanner.sourceforge.net/) or part of very expensive commercial packages (https://www.checkmarx.com/ and http://armorize.com/). Next to the need for an open-source alternative to the previously mentioned tools, we also need to look at increasing our scope. Rather than focusing on a single codebase, the tool would ideally be able to scan many remote, large-scale repositories and report the findings back in an easily accessible way.
An interesting target for this research would be very popular, open-source (at this stage) Content Management Systems (CMSs), and specifically plug-ins created for these CMSs. CMS cores are held to a very high coding standard and are often relatively secure. Plug-ins, however, are necessarily less so, but are generally as popular as the CMSs they’re created for. This is problematic, because an insecure plug-in is as dangerous as an insecure CMS. Experienced programmers and security experts generally audit the most popular plug-ins, but this is: a) very time-intensive, b) prone to errors and c) of limited scope, i.e. not every plug-in can be audited. For example, if it was feasible to audit all aspects of a CMS repository (CMS core and plug-ins), the DigiNotar debacle could have easily been avoided.
Research proposal
Your research would consist of extending our proof-of-concept static code analyzer written in Python and using it to scan code repositories, possibly of some major CMSs and their plug-ins, for security issues and finding innovative ways of reporting on the massive amount of possible issues you are sure to find. Help others keep our data that little bit more safe.
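For readers new to static analysis, the sketch below shows the basic mechanism on a small scale: parse source code into a syntax tree and flag suspicious constructs without running it. It is not the proof-of-concept analyzer mentioned above; it analyzes Python source only because the standard library ships a Python parser, and the rule set (`eval`, `exec`) is an illustrative assumption.

```python
import ast

DANGEROUS_CALLS = {"eval", "exec"}   # illustrative rule set, far from complete

def find_dangerous_calls(source, filename="<string>"):
    """Return sorted (line, name) pairs for direct calls to names in DANGEROUS_CALLS."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in DANGEROUS_CALLS):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)

# Hypothetical code under analysis; it is parsed, never executed.
sample = (
    "user_input = input()\n"
    "result = eval(user_input)\n"
    "print(len(user_input))\n"
)
findings = find_dangerous_calls(sample)
print(findings)
```

Scanning a repository amounts to running such checks over every file; tracking whether the flagged argument actually originates from untrusted input (taint analysis) is where most of the difficulty lies.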
Patrick Jagusiak <patrick.jagusiak=>dongit.nl>
Wouter van Dongen <wouter.vandongen=>dongit.nl>



30

Developing a Distributed State of Mind.

A system required to be autonomous needs to be more than just a computational black box that produces a set of outputs from a set of inputs. Interpreted as an agent provided with (some degree of) rationality, it should act based on desires, goals and internal knowledge for justifying its decisions. One could then imagine a software agent much like a human being or a human group, with multiple parallel threads of thoughts and considerations which more often than not are in conflict with each other. This distributed view contrasts with the common centralized view used in agent-based programming, and opens up potential cross-fertilization with distributed computing applications, which for the moment is for the most part unexplored.

The goal of this project is the design and development of an efficient agent architecture in a modern concurrent programming language (Rust, Go, or Scala), in order to make explicit a multi-threaded, distributed state of mind.
Giovanni Sileno <G.Sileno=>uva.nl>
Mostafa Mohajeriparizi <m.mohajeriparizi=>uva.nl>


31

Profiling (ab)user behavior of leaked credentials.

Appropriate secret management during software development is important, as such secrets are often highly privileged accounts, private keys or tokens which can grant access to the system being developed or any of its dependencies. Systems here could entail virtual machines or other software/infrastructure/platform services exposing a service for remote management. To combat mismanagement of secrets during development, software projects such as HashiCorp Vault or CyberArk Conjur have been introduced to provide a structured solution to this problem and to ensure secrets are not exposed, by removing them from the source code.

Unfortunately, secrets are still frequently committed to software repositories, and accidentally end up either in released packages or in publicly accessible repositories, such as on GitHub. That these secrets can then easily be accessed (and potentially abused) in an automated fashion has recently been demonstrated by POC projects like shhgit [1].
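How easily such secrets can be found automatically is illustrated by the sketch below, which scans text for two well-known secret formats. The patterns and names are assumptions for illustration and are not taken from shhgit; the sample key is the placeholder access key ID used in AWS documentation.

```python
import re

# Illustrative patterns only; real scanners use far larger rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_text(text):
    """Return (line_number, pattern_name) for every line matching a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings

# Hypothetical committed file containing a documentation placeholder key.
sample = "region = 'eu-west-1'\nkey = 'AKIAIOSFODNN7EXAMPLE'\n"
findings = scan_text(sample)
print(findings)
```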

This research would entail the study and profiling of the behavior of the abuser of such secrets by first setting up a monitoring environment and a restricted execution environment before intentionally leaking secrets online through different channels.

In particular, the research focuses on answering the following questions:
- Can abuse as a result of leaked credentials be profiled?
- Can profiles be used to predict abuser behavior?
- Are there different abusers / patterns / motives for different types of leaked credentials?
- Are there different abusers / patterns / motives for different sources of leaked credentials?
- Can profiles be used to attribute attacks to different attacker groups?

Prior experience with security monitoring and/or cloud environments such as AWS / Azure is recommended, in order to scope the research to a feasible proposal in time.

[1] https://github.com/eth0izzle/shhgit
Fons Mijnen <fmijnen=>deloitte.nl>
Mich Cox <mcox=>deloitte.nl>


32

Development of a control framework to guarantee the security of a collaborative open-source project.

We’re now living in an information society, and everyone expects to be able to find everything on the Web. IT developers are no exception and spend a large part of their working hours searching for and reusing pieces of code found in public repositories (e.g. GitHub, GitLab, …) or on web forums (e.g. StackOverflow).

The use of open-source software has long been seen as a secure alternative, as the code is available for everyone to review and, as a result, bugs and vulnerabilities should be found and fixed more easily. Multiple incidents related to the use of open-source software (NPM, Gentoo, Homebrew) have shown that the greater security of open-source components turned out to be theoretical.

This research aims to highlight the root causes of major recent incidents related to open-source collaborative projects, as well as to propose a global open-source security framework that could address those issues.

Tim Dijkhuizen <Dijkhuizen.Tim=>kpmg.nl>
Ruben Koeze <Koeze.Ruben=>kpmg.nl>
Aristide Bouix <Bouix.Aristide=>kpmg.nl>


33

Characterization of the CASB (Cloud Access Security Broker) Technology Market.

CASB are often referred to as the new firewall of the Cloud era; they provide a middleman layer between a company network and a set of web services. However, many companies have been disappointed by their capabilities after implementation.

The objective of this research is to provide a detailed comparison of the current capabilities of the market leaders against the capabilities expected by corporate IT executives. You will have to address the following points:
  • Give an accurate definition of the CASB technology
  • Identify the risks related to Shadow IT and potential mitigation for each of them
  • List the capabilities of the Leaders and Visionaries in the Gartner Quadrant and verify they address the risks related to Shadow IT
  • Give your conclusion on the maturity of this market
NB: The challenge of this subject is to find reliable information. You can start by contacting providers’ sales departments to request a demonstration, cross-checking the references of research publications, or interacting on cybersecurity web forums.

Reference: https://www.bsigroup.com/globalassets/localfiles/en-ie/csir/resources/whitepaper/1810-magic_quadrant_for_casb.pdf
Tim Dijkhuizen <Dijkhuizen.Tim=>kpmg.nl>
Ruben Koeze <Koeze.Ruben=>kpmg.nl>
Aristide Bouix <Bouix.Aristide=>kpmg.nl>


34

Analysis of a rarely implemented security feature: signing Docker images with a Notary server.

Notary is Docker's platform to provide trusted delivery of content by signing images that are published. A content publisher can then provide the corresponding signing keys that allow users to verify that content when it is consumed. Signing Docker images is considered a security best practice, but is rarely implemented in practice.

The goal of this project is to provide guidelines for safe service implementation. A starting point could be:
  • Get familiar with the service Architecture and Threat Model [1]
  • Deploy a production-like service [2]
  • Test the compromise scenarios from the Threat Model
  • Conclude and release a secure production-ready manual and docker-compose template
Reference:
  1. https://docs.docker.com/notary/service_architecture/
  2. https://docs.docker.com/notary/running_a_service/
Aristide Bouix <Bouix.Aristide=>kpmg.nl>
Tim Dijkhuizen <Dijkhuizen.Tim=>kpmg.nl>
Ruben Koeze <Koeze.Ruben=>kpmg.nl>


35

Security of IoT communication protocols on the AWS platform.

In January 2020, Jason and Hoang from the OS3 master worked on the project “Security Evaluation on Amazon Web Services’ REST API Authentication Protocol Signature Version 4”[1]. This project has shown the resilience of the Sigv4 authentication mechanism for HTTP protocol communications.
In June 2017, AWS released a service called AWS Greengrass[2] that can be used as an intermediate server for low connectivity devices running the AWS IoT SDK[3]. This is an interesting configuration, as it allows Sigv4 authentication to be challenged further in a disconnected environment using the MQTT protocol.
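As background on what is being evaluated, the sketch below shows the signing-key derivation at the core of Signature Version 4: a chain of HMAC-SHA256 operations over the date, region and service, ending with the fixed string `aws4_request`. The secret key is the placeholder from AWS documentation; the date, region, service name and string to sign are hypothetical values, and the construction of the canonical request is left out.

```python
import hashlib
import hmac

def _hmac_sha256(key, message):
    """HMAC-SHA256 of a text message under a bytes key, as raw bytes."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()

def sigv4_signing_key(secret_key, date_stamp, region, service):
    """Derive the SigV4 signing key for one day, region and service."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")

def sign(signing_key, string_to_sign):
    """Hex signature of an already assembled string to sign."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()

# Placeholder credentials and scope; nothing here is a real secret.
key = sigv4_signing_key("wJalrXUtnFEMI/K7MDENG+bPxRfiCYEXAMPLEKEY",
                        "20200101", "eu-west-1", "example-service")
signature = sign(key, "placeholder string to sign")
print(signature)
```

Because the derived key is scoped to a date, region and service, a leaked signing key is less valuable than the long-term secret, which is one of the properties a security evaluation can probe.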

Reference:
  1. https://homepages.staff.os3.nl/~delaat/rp/2019-2020/p65/report.pdf
  2. https://docs.aws.amazon.com/greengrass/latest/developerguide/what-is-gg.html
  3. https://github.com/aws/aws-iot-device-sdk-python
Aristide Bouix <Bouix.Aristide=>kpmg.nl>
Tim Dijkhuizen <Dijkhuizen.Tim=>kpmg.nl>
Ruben Koeze <Koeze.Ruben=>kpmg.nl>


36

Developing a transparent high-speed Intrusion detection and Firewall/ACL mechanism (a plus would be if it can be based on AI/Machine learning algorithms).

We have some experimental Mellanox Networking technology that needs to be used in this RP.
The goal of this project is to figure out how to build a transparent automated IDS and Firewall/ACL mechanism that can handle large amounts of traffic (100Gbit/s or more).
Some starting points are:
  • Get to know things like OVS, DPDK and VPP
  • Test different technologies for high speed IDS
  • Test different technologies for high speed Firewalling/ACLs and offloading into hardware (Mellanox-based)
  • Figure out how to use AI/Machine learning algorithms to speed up the process.
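For orientation, the logic of an ACL itself is simple, as the sketch below shows: an ordered rule table with first-match-wins semantics and a default deny. The rule table and function name are hypothetical. A software loop like this is nowhere near 100Gbit/s; the sketch only pins down the behavior that a fast path or hardware offload would have to reproduce.

```python
import ipaddress

# Hypothetical rule table: (action, source network, destination port or None for any).
# First match wins; anything unmatched is denied.
RULES = [
    ("deny", ipaddress.ip_network("10.0.5.0/24"), None),
    ("allow", ipaddress.ip_network("10.0.0.0/8"), 443),
]

def decide(src_ip, dst_port):
    """Return 'allow' or 'deny' for a packet with the given source and port."""
    addr = ipaddress.ip_address(src_ip)
    for action, network, port in RULES:
        if addr in network and (port is None or port == dst_port):
            return action
    return "deny"

for packet in [("10.0.5.7", 443), ("10.1.2.3", 443), ("10.1.2.3", 80)]:
    print(packet, decide(*packet))
```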
Cedric Both <cedric=>datadigest.nl>

37

Scalable security applying IDS, antivirus and similar monitoring VPN connections.

We would like to figure out a scalable security solution that uses IDS, antivirus and other mechanisms to monitor the incoming and outgoing traffic of different VPN connections. We have VPN servers running on open source software (e.g. OpenVPN and IPsec) with lots of users and throughput, and we would like an automated way to transparently monitor all user traffic, detect anomalies (for example when a user accesses a malicious website or downloads something they shouldn't have), and report per user.
The goal of this project is to give security information back to the user that is using the VPN connection to know if the data that is going over the VPN connection is "Safe and Clean".
Some starting points of this RP are:
  • Research how VPN traffic flows can be scanned based on individual users
  • Test different technologies for high traffic and high user IDS/antivirus scanning etc.
  • Figure out the most secure way of giving the security information back to each user.
Cedric Both <cedric=>datadigest.nl>

38

Threat modeling on a concrete Digital Data Market.

Security and sovereignty are top concerns for data federations among normally competing organizations. Digital Data Marketplaces (DDMs) are emerging as architectures to support this mode of interaction. We have designed an auditable secure network overlay for multi-domain distributed applications to facilitate such trustworthy data sharing. We prove our concepts with a running demonstration which shows how a simple workflow can run across organizational boundaries.

It is important to know how secure the overlay network is for data federation applications. You will perform threat modeling on a concrete DDM use case.
  • Discover a detailed outline of attack vectors.
  • Investigate which attacks are already successfully detected or avoided with current secure mechanisms.
  • Discover remaining security holes and propose possible countermeasures.
"Zhang, Lu" <l.zhang2=>uva.nl>

39

Azure Elastic SQL Evaluation.

Solvinity is a secure hybrid cloud provider, with many large public sector / private sector clients from the Netherlands. We work with Microsoft solutions but we have to decide whether a new product is worth the effort to offer to our clients. We would like students to stress test and evaluate the new Microsoft Azure SQL database solutions focusing on but not limited to the new Elastic Database solution.

High level overview:

- Select use case and create good simulation of use case together with Solvinity engineering.
- Create a database testing methodology, either similar to the reference work or new. Potentially others.
- Deploy above created use case and methodologies on Azure cloud services, in the three different technologies offered by Azure.
- Analyse aspects of deployment, maintainability, security etc. together with engineering.
- Write report.

Some reference work:

https://mikhail.io/2018/02/load-testing-azure-sql-database-by-copying-traffic-from-production-sql-server/
https://blog.pythian.com/benchmarking-google-cloud-sql-instances/
Peter Prjevara <peter.prjevara=>solvinity.com>


40

Version management of project files in ICS.

Research in Industrial Control Systems: It is difficult to have proper version management of project files, as they are usually stored offline. We would like to come up with a solution to back up and store project files in real time on a server, with the capability to revert to earlier versions, take snapshots, etc. Something like Puppet/Chef/Ansible, but for ICS.
Michel van Veen <mvanveen=>deloitte.nl>
Mohamed Abdelkader <moabdelkader=>deloitte.nl>


41

Real time asset inventory in ICS.

Research in Industrial Control Systems: We know how to do passive asset inventory within ICS and can create an “as-is” network diagram.
  • But how do you keep it updated while new assets are being added to the network?
ICS environments are more static than IT environments, but real-time asset discovery tooling could be very interesting in some cases.
Michel van Veen <mvanveen=>deloitte.nl>
Mohamed Abdelkader <moabdelkader=>deloitte.nl>


42

SDN in ICS.

Research in Industrial Control Systems: Splitting the automation data plane from the network management plane,
  • is it possible in ICS?
  • What are the limitations?
  • What are the benefits?
Michel van Veen <mvanveen=>deloitte.nl>
Mohamed Abdelkader <moabdelkader=>deloitte.nl>

Marios Andreou <Marios.Andreou=>os3.nl>
RP2
43

Future tooling and cyber defense strategy for ICS.

Research in Industrial Control Systems: Is zero trust networking possible in ICS? This is one of the questions we are wondering about to sharpen our vision and story around where ICS security is going and which solutions are emerging.
Michel van Veen <mvanveen=>deloitte.nl>
Mohamed Abdelkader <moabdelkader=>deloitte.nl>


44

Involuntary browser based torrenting (WebRTC).

WebRTC research: WebRTC allows for browser-based P2P torrenting.
  • Can this function be misused to let visitors of webpages involuntarily participate in P2P networks?
  • For example to share illegal or protected media?
  • If so, is this an already established and (widely) used tactic?
A possible addition is a look into how such attacks could be performed, together with a Proof-of-Concept.
Cedric van Bockhaven <CvanBockhaven=>deloitte.nl>
Jan Freudenreich <jfreudenreich=>deloitte.nl>

Alexander Bode <alexander.bode=>os3.nl>
RP2
45

End-to-end encryption for browser-based meeting technologies.

Investigating the possibilities and limitations of end-to-end encrypted browser-based video conferencing, with a specific focus on security and preserving privacy.
  • What are possible approaches?
  • How would they compare to each other?
Jan Freudenreich <jfreudenreich=>deloitte.nl>



46

Evaluation of the Jitsi Meet approach for end-to-end encrypted browser-based video conferencing.

Determining the security of the library, implementation and the environment setup.
Jan Freudenreich <jfreudenreich=>deloitte.nl>

47

Acceleration of Microsoft SEAL library using dedicated hardware (FPGAs).

Homomorphic encryption allows encrypted data to be processed, making it possible to use services while sharing only encrypted information with the service provider. However, the performance of homomorphic encryption can be limiting for certain applications. The Microsoft Simple Encrypted Arithmetic Library (Microsoft SEAL) is a homomorphic encryption library (https://github.com/Microsoft/SEAL) released as open source under the MIT license. The overall goal of the project is to improve the performance of the library by accelerating it by means of dedicated hardware.
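The idea of computing on encrypted data can be illustrated with a toy version of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. Note that this is a different and much simpler scheme than the lattice-based schemes implemented in SEAL, and that the tiny primes and fixed randomness below are insecure choices made only to keep the example readable.

```python
import math

# Toy parameters: tiny primes, insecure, for illustration only.
P, Q = 293, 433
N = P * Q
N_SQ = N * N
G = N + 1
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(P-1, Q-1)
MU = pow(LAMBDA, -1, N)                                # modular inverse (Python 3.8+)

def encrypt(m, r):
    """Encrypt 0 <= m < N using randomness r with gcd(r, N) == 1."""
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ

def decrypt(c):
    """Recover the plaintext from a ciphertext using the private values LAMBDA and MU."""
    return ((pow(c, LAMBDA, N_SQ) - 1) // N * MU) % N

def add_encrypted(c1, c2):
    """Homomorphic addition: multiply ciphertexts to add the plaintexts."""
    return (c1 * c2) % N_SQ

# The "server" adds two values it cannot read; only the key holder decrypts.
c_sum = add_encrypted(encrypt(20, 17), encrypt(22, 23))
print(decrypt(c_sum))
```

Even in this toy scheme every operation is a modular exponentiation or multiplication on large numbers, which gives a sense of why arithmetic kernels are natural candidates for hardware acceleration.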

The specific use case considered here is the acceleration of particularly critical routines using reconfigurable hardware (FPGA). The research will address the following challenges:

  • The profiling of library components to identify the main bottlenecks. The functions that would benefit more from the hardware acceleration will be identified and a small subset of them would be selected.
  • An optimized hardware architecture for the functions previously selected will be designed and implemented using a hardware description language (VHDL or Verilog)
  • The performance of the hardware design will be evaluated using state of the art design tools for reconfigurable hardware (FPGA) and the speed up achieved on the overall library will be estimated
Francesco Regazzoni <f.regazzoni=>uva.nl>


48

High level synthesis and physical attacks resistance.

High-level synthesis is a well-known approach that allows designers to quickly explore different hardware optimizations starting from high-level behavioral code. Despite its widespread use, the effects of such an approach on security have not been explored in depth yet. This project focuses on physical attacks, where the adversary infers secret information by exploiting implementation weaknesses, and aims at exploring the effect of different high-level synthesis optimizations on physical attacks.

The specific use case considered here is the analysis of the resistance against physical attacks of particularly critical blocks of cryptographic algorithms when implemented in hardware using high-level synthesis. The research will address the following challenges:

  • The selection of the few critical blocks to be explored and the implementation of them using high level behavioral language
  • The realization of different versions of the previously selected blocks using high level synthesis tools.
  • The collection (or the simulation) of the traces needed to mount the side channel attack
  • The security analysis of each version of each block and the analysis of the effects on security of the particular optimization used to produce each version.
Francesco Regazzoni <f.regazzoni=>uva.nl>

49

Embedded FPGAs (eFPGAs) for security.

Reconfigurable hardware offers the possibility to easily reprogram its functionality in the field, making it suitable for applications where some flexibility is required. Cryptography is certainly among these applications, especially when implemented in embedded and cyber-physical systems. These devices often have a lifetime that is much longer than that of usual consumer electronics, so they need to provide so-called crypto-agility (the capability to update an existing cryptographic algorithm). Reconfigurable hardware is currently designed for general purpose use, while better performance could be reached by using reconfigurable blocks specifically designed for cryptography (https://eprint.iacr.org/2018/724.pdf). In this project, open source design tools have to be explored to build a design flow for converting HDL designs into a physical implementation of the algorithm on the novel reconfigurable block.

The specific use case considered here is the exploration of possible architectures for connecting the novel reconfigurable block and the estimation of the overhead of the connections. The research will address the following challenges:

  • acquire familiarity with the VTR tool and other relevant design tools through discussion with the supervisor, online tutorials and example implementations,
  • develop a custom design flow for the reconfigurable block presented by Mentens et al.
  • validate the design flow and the eFPGA architecture through a number of existing cryptographic benchmark circuits.

This thesis is in collaboration with KU Leuven (Prof. Nele Mentens)
Francesco Regazzoni <f.regazzoni=>uva.nl>

50

Approximate computing and side channels.

Approximate computing is an emerging computing paradigm where the precision of the computation is traded for other metrics such as energy consumption or performance. This paradigm has been shown to be effective in various applications, including machine learning and video streaming. However, the effects of approximate computing on security are still unknown. This project investigates the effects of the approximate computing paradigm on side channel attacks.

The specific use case considered here is the exploration of the resistance of devices against power analysis attacks when classical techniques used in the approximate computing paradigm to reduce energy consumption (such as voltage scaling) are applied. The research will address the following challenges:

  • Selection of the most appropriate techniques for energy saving among the ones used in the approximate computing paradigm
  • Realization of a number of simple cryptographic benchmarks using an HDL (VHDL or Verilog)
  • Simulation of the power consumption in the different scenarios
  • Evaluation of the side channel resistance of each scenario
This thesis is in collaboration with University of Stuttgart (Prof. Ilia Polian)
Francesco Regazzoni <f.regazzoni=>uva.nl>

51

Decentralize a legacy application using blockchain: a crowd journalism case study.

Blockchain technologies have demonstrated a huge potential for application developers and operators to improve service trustworthiness, e.g., in logistics, finance and provenance. The migration of a centralized distributed application to a decentralized paradigm often requires not only a conceptual re-design of the application architecture, but also a profound understanding of the technical integration between the business logic and the blockchain technologies. In this project, we will use a social network application (crowd journalism) as a test case to investigate the integration possibilities between a legacy system and the blockchain. Key activities in the project:
  1. investigate the integration possibilities between social network application and permissioned blockchain technologies,
  2. make a rapid prototype to demonstrate the feasibility, and
  3. assess the operational cost of blockchain services.
The crowd journalism software will be provided by an SME partner of the EU ARTICONF project.

References: http://www.articonf.eu
Zhiming Zhao <z.zhao=>uva.nl>


52

Location aware data processing in the cloud environment.

Data intensive applications are often workflows involving distributed data sources and services. When the data volumes are very large, especially with different access constraints, the workflow system has to decide on suitable locations to process the data and to deliver the results. In this project, we perform a case study of eco-Lida data from different European countries; the processing will be done using the test bed offered by the European Open Science Cloud. The project will investigate data location aware scheduling strategies and service automation technologies for workflow execution. The data processing pipeline and data sources in the use case will be provided by partners in the EU Lifewatch, and the test bed will be provided by the European Open Science Cloud early adopter program.
Zhiming Zhao <z.zhao=>uva.nl>

53

Conditional Access and Log Correlation Engine.

Solvinity is a secure hybrid cloud provider with many large public sector and private sector clients from the Netherlands. We work with Microsoft solutions, but we have to decide whether a new product is worth the effort to offer to our clients or to implement in our systems. One of the big selling points of Microsoft is that they offer automated log analysis and correlation and, based on that, conditional access rules applied directly to Active Directory and LDAP systems.

We would like SNE students to examine the possibility of implementing log correlation using one of the available open source tools such as SNORT, DSIEM (https://github.com/defenxor/dsiem) or SAGAN (https://github.com/beave/sagan), and tie it back to our on-premises Active Directory solution. Additionally, we would like to know whether the log correlation engine makes it possible to identify:
  • Malicious login attempts with certainty
  • The use of unapproved Shadow IT applications within the company or clients
  • Data leaks or attempts of data exfiltration
Péter Prjevara <peter.prjevara=>gmail.com>


54

Research IPv6 over IPv4 tunnel mechanisms.

Identify possible ways to hide traffic using tunneling mechanisms. The aim of this research is to identify ways to bypass firewalls using IPv6 over IPv4 tunneling. The first step is to understand how the various protocols for tunneling IPv6 over IPv4 work by looking at RFCs and prior research. Subsequently, a specific software implementation of a popular firewall vendor can be picked and evaluated to see whether it correctly implements the tunneling protocol. In case deviations are identified, you can look into how these deviations can possibly be exploited.
Arris Huijgen <Ahuijgen=>deloitte.nl>


55

Identify IPv6 addresses based on a public IPv4 address.

The aim of this research is to potentially bypass firewalls by using the IPv6 address instead of the IPv4 address, because firewalls might only be blocking traffic on IPv4 and not on IPv6. For that we need an approach to identify the IPv6 counterpart of an IPv4 address.

The result of this research will be a description of how IPv4 addresses map to IPv6 addresses (for example by looking at ASNs) and a practical approach to identify an IPv4 address based on its IPv6 address (or vice versa).
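As a starting point, some IPv6 transition mechanisms embed the IPv4 address directly in the IPv6 address. The sketch below uses Python's standard `ipaddress` module to handle two such well-known cases: IPv4-mapped addresses (`::ffff:a.b.c.d`) and 6to4 prefixes (`2002:V4ADDR::/48`). The function names are hypothetical. This covers only these structural mappings; the native IPv6 address of a dual-stack host generally cannot be derived this way, which is the open question of the project.

```python
import ipaddress
from typing import Optional


def sixtofour_prefix(ipv4: str) -> ipaddress.IPv6Network:
    """Return the 6to4 /48 prefix (2002:V4ADDR::/48) derived from an IPv4 address."""
    v4_int = int(ipaddress.IPv4Address(ipv4))
    # 16-bit 6to4 prefix 0x2002, followed by the 32-bit IPv4 address.
    base = (0x2002 << 112) | (v4_int << 80)
    return ipaddress.IPv6Network((base, 48))


def embedded_ipv4(ipv6: str) -> Optional[ipaddress.IPv4Address]:
    """Return the IPv4 address embedded in an IPv6 address, or None.

    Only IPv4-mapped and 6to4 addresses are recognised in this sketch.
    """
    addr = ipaddress.IPv6Address(ipv6)
    if addr.ipv4_mapped is not None:
        return addr.ipv4_mapped
    if addr.sixtofour is not None:
        return addr.sixtofour
    return None


if __name__ == "__main__":
    # 192.0.2.1 and 2001:db8::/32 are documentation addresses.
    print(sixtofour_prefix("192.0.2.1"))
    print(embedded_ipv4("::ffff:192.0.2.1"))
    print(embedded_ipv4("2001:db8::1"))
```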
Arris Huijgen <Ahuijgen=>deloitte.nl>

55

Improving Red Team Reconnaissance.

During red teaming exercises it is of vital importance for the red team to know when the blue team has recognised their actions and is investigating their artefacts. Having such knowledge gives the red team the opportunity to either brace for impact, clean up their channels and lay low, change C2-channels or otherwise adjust their attacks to perhaps remain hidden from the blue team. This is all done in order to be a better sparring partner for the blue team and give them better training. During our red team exercises we make use of many different ways of detecting blue team activities. As we believe the entire red teaming industry needs to improve, we have open sourced some of these checks into our RedELK tooling (1). More info on our approach and details of RedELK here (2).

We are always looking for novel ways of blue team detection. Lately, research was disclosed where adwords are used for this purpose (3). We want students to investigate the feasibility of using adwords for detecting blue team activity, as well as deliver a fully working PoC. The analysis should include effectiveness as well as ease of setup and the possibility of inclusion in RedELK.

We are open to other novel ways for detection of blue team activities. This RP can easily be changed to your liking if you have another novel technique and hypothesis that fits the end goal. Get in touch if you have such an idea.

Students preferably already have experience in either offensive or defensive IT operations.
Marc Smeets <marc=>outflank.nl>

56

The Serval Project; Making a low-cost, scalable tsunami and all-hazards warning system with integrated FM radio transmitter.

The Sulawesi earthquake reminded us of the significant gap that exists between the generation of tsunami (and other hazard) warnings, which works quite well, and the means of getting those warnings out to the small isolated coastal communities that need them. There is a need for a low-cost and scalable solution for providing early warning capabilities. Such a system also needs to be useful year-round, so that it will be maintained and work when needed. For this reason, we are building an FM radio juke-box into the system. In this project, you will help to advance this project from proof-of-concept to prototype stage, by assisting with the creation of the radio juke-box client software as well as the back-end middle-ware for feeding alerts to the satellite up-link and receiving them on the terminal equipment.
Paul Gardner-Stephen <paul.gardner-stephen=>flinders.edu.au>

57

The Serval Project; Shaking down the Serval Mesh Extender

The Serval Mesh Extender is a ruggedised solar-powered peer-to-peer mesh communications system that allows isolated communities to have local communications and, through interconnection to HF digital radios and other means, to connect those communities together. The system now largely works, but effort is required to test the system more thoroughly under realistic conditions, and to document and then eliminate software issues that interfere with the efficient operation of the system. This will occur through interaction with a remotely accessed semi-automatic test bed network.
Paul Gardner-Stephen <paul.gardner-stephen=>flinders.edu.au>

58

The Serval Project; Security through Simplicity: Creating an open-source smart-phone like device.

SPECTRE and MELTDOWN, and the 20 years it took to discover the vulnerabilities, have unambiguously shown that the complexity of modern computing devices has grown to the point where verification of security is simply impossible. Yet we still need strong assurances of security to support many uses of modern technology. However, things were not always like this. Computers of the 80s and 90s were simple enough that both hardware and software could be verified. Therefore we are creating an open-source smart-phone like device based on an improved evolution of the well known Commodore 64 architecture implemented in FPGA. We have working bench-top prototypes and are moving to prototype hardware. We are looking for both IT/CS and electronic engineering students to help move the project to prototype stage, to test the hardware, and to implement the software, so that we can have usable test devices in 2019.
Paul Gardner-Stephen <paul.gardner-stephen=>flinders.edu.au>

59

The current state of DNS Lame delegations; An analysis of the current state of lame delegations within the Swedish .se tld.

The Domain Name System (DNS), described by Paul Mockapetris in RFC 1034 [1], provides a mechanism for naming resources in such a way that human-readable host names are usable across different hosts and networks and map to machine-interpretable addresses. The DNS was designed by engineers with a focus on scalability rather than security. The security requirement came with the availability of the internet to the general public in the 1990s. At that time, DNS had already acquired an indispensable position within the internet information structure.

The DNS is used by email to locate the mail agent that belongs to the host name used in an email address. This is done through the mail exchange (MX) resource record within the DNS zone. The MX record specifies which host runs the mail agent for the domain; this agent should accept mail for forwarding to the domain. Other delegations, such as SRV records for IMAP, POP and VoIP, may exist alongside it.
David Barr [2] describes errors often found in the operation of DNS in RFC 1912.

One of the errors described is the lame delegation. Lame delegations are delegations, like the MX record, that delegate functionality to another entity while that entity is not in place to serve the request. Some of them are caused by misconfiguration, others by configuration changes or legacy. In this paper, we will study the impact of these lame delegations on the DNS in relation to internet services such as e-mail delivery.
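The definition above can be sketched as a simple check: a delegation is lame when the target it points to does not actually serve the request. In the sketch below the network lookup is left out on purpose. `serves_zone` is a hypothetical callback that a real study would implement with a DNS library (for example by sending an SOA query and checking for an authoritative answer), and all host names are made up.

```python
from typing import Callable, Iterable, List

# Hypothetical hook: returns True if `server` actually serves `zone`.
# A real implementation would query the server over the network.
ServesZone = Callable[[str, str], bool]


def find_lame_delegations(
    zone: str,
    delegated_servers: Iterable[str],
    serves_zone: ServesZone,
) -> List[str]:
    """Return the delegated servers that do not serve `zone`.

    Servers that cannot be reached (OSError, including timeouts) are
    counted as lame as well, since the delegation cannot be served.
    """
    lame = []
    for server in delegated_servers:
        try:
            ok = serves_zone(server, zone)
        except OSError:
            ok = False
        if not ok:
            lame.append(server)
    return lame


if __name__ == "__main__":
    # Made-up data standing in for real lookups.
    actually_serving = {("ns1.example.se", "example.se")}

    def fake_check(server: str, zone: str) -> bool:
        if server == "ns3.example.se":
            raise TimeoutError("no response")
        return (server, zone) in actually_serving

    servers = ["ns1.example.se", "ns2.example.se", "ns3.example.se"]
    print(find_lame_delegations("example.se", servers, fake_check))
```

Injecting the check as a callback keeps the classification logic testable without network access; the same function could be applied to MX or SRV targets by swapping the callback.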
Arris Huijgen <AHuijgen=>deloitte.nl>

Alexander Blaauwgeers <alexander.blaauwgeers=>os3.nl>
RP2

60

Analysis of a new privacy technology and its implementation: “Image Cloaking”.

The last decade has seen the rise of new image recognition technologies that, combined with the rise of IoT and the miniaturization of digital cameras, potentially endanger our traditional notion of personal privacy. In this context, a team of researchers from the SAND laboratory at the University of Chicago released a Python tool called Fawkes, intended to disrupt how facial recognition algorithms relate multiple images to each other.
 
Limiting this technology's use to only a few tech-savvy individuals would not impact the current trend. It is also crucial to make sure that the technology works and can be deployed at scale. The goal of this project is to pave the way to broader adoption of such technologies on online uploading platforms. A proposed scientific approach could be:
  • Verify a few results from the research paper, to confirm the code works as intended
  • Build a Proof of Concept photo website where uploaded face pictures would be cloaked on-the-fly using Fawkes
  • Optimize the configuration of the cloaking feature to get a seamless experience, or propose architectural workarounds
 
Reference:
Aristide Bouix <Bouix.Aristide=>kpmg.nl>
Tim Dijkhuizen <Dijkhuizen.Tim=>kpmg.nl>


61

Wi-Fi 6 - BSS colouring in the home environment

BSS Colouring aims to significantly improve the end-user experience in terms of throughput and latency in high density wireless environments, for example in urban areas. A major cause of poor performance in dense Wi-Fi environments is mutual interference between access points (APs) that share the same channel. Wi-Fi copes with this co-channel interference (CCI) by Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA): a radio wanting to transmit first listens on its frequency, and if it hears another transmission in progress it waits a while before trying again. CCI is not actually interference but more a sort of congestion. It hinders performance by increasing the wait time, as the same channel is used by different devices. CCI forces other devices to defer transmissions and wait in a queue until the first device finishes transmitting and the channel is free. Even if two APs are too far apart to detect each other's transmissions directly, a client of either AP located in between them can effectively trigger collision avoidance when one AP hears it talking to the other. The unnecessary medium contention overhead that occurs when too many APs and clients hear each other on the same channel is called an overlapping basic service set (OBSS).

BSS colouring is a feature in the IEEE 802.11ax standard to address medium contention overhead due to OBSS by assigning a different "colour", a number between 1 and 63 that is added to the PHY header of the 802.11ax frame, to each BSS in an environment. When an 802.11ax radio is listening to the medium and hears the PHY header of an 802.11ax frame sent by another 802.11ax radio, the listening radio will check the BSS colour bits of the transmitting radio. Channel access is dependent on the colour detected:
  • If the colour is the same, then the frame is considered an intra-BSS transmission, and the Preamble Detection (PD) threshold remains unchanged; in other words, the normal CSMA/CA process is followed.
  • If the colour is different, then the frame is considered an inter-BSS transmission. The station increases its PD threshold to limit the range of physical carrier sense so as to reduce the chance of contention with the neighbour AP.
This is important because the number of Wi-Fi stations in use continues to increase, and the space between them is decreasing. BSS colouring gives the 802.11ax standard, also labelled Wi-Fi 6 by the Wi-Fi Alliance, the ability to discover spatial reuse opportunities. Spatial reuse can then be exploited to enable more parallel conversations within the same physical space.
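The colour-dependent channel access rule described above can be sketched as a small decision function. The threshold values and all names below are assumptions for illustration only; they are not taken from the project description, and a real study would use the MATLAB WLAN toolbox models instead.

```python
from dataclasses import dataclass

# Illustrative thresholds in dBm (assumed values for this sketch).
DEFAULT_PD_THRESHOLD_DBM = -82.0   # used for intra-BSS frames
OBSS_PD_THRESHOLD_DBM = -72.0      # raised threshold for inter-BSS frames


@dataclass(frozen=True)
class Frame:
    """Minimal view of a received 802.11ax PHY header."""
    bss_colour: int    # colour value, 1..63
    rssi_dbm: float    # received signal strength


def medium_busy(own_colour: int, frame: Frame,
                obss_pd_dbm: float = OBSS_PD_THRESHOLD_DBM) -> bool:
    """Decide whether the listening radio must defer to `frame`.

    Same colour: intra-BSS, the PD threshold is unchanged.
    Different colour: inter-BSS, a higher PD threshold is applied, so
    weak frames from a neighbouring BSS no longer block transmission.
    """
    if not 1 <= frame.bss_colour <= 63:
        raise ValueError("BSS colour must be between 1 and 63")
    if frame.bss_colour == own_colour:
        threshold = DEFAULT_PD_THRESHOLD_DBM
    else:
        threshold = obss_pd_dbm
    return frame.rssi_dbm >= threshold


if __name__ == "__main__":
    weak_signal = -78.0
    print(medium_busy(5, Frame(bss_colour=5, rssi_dbm=weak_signal)))  # True
    print(medium_busy(5, Frame(bss_colour=9, rssi_dbm=weak_signal)))  # False
```

With these assumed thresholds, a frame received at -78 dBm blocks the medium when it comes from the own BSS but not when it comes from a neighbour, which is the spatial reuse opportunity.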

Our expectation is that BSS colouring will increase Medium Access Control (MAC) efficiency, and that the user should experience lower latency and an increase in throughput. The noise floor of the operating channel, however, will deteriorate, reducing the Signal-to-Noise Ratio (SNR) and potentially resulting in a drop in the Modulation and Coding Scheme (MCS) rate. In short, BSS colouring improves MAC efficiency at the cost of noise in the physical (PHY) layer. Legacy 802.11a/b/g/n clients and APs will not be able to interpret the colour bits because they use a different PHY header format.

The main research questions are:
  • What is the impact of using BSS colouring on the performance in terms of throughput and latency of the own and neighbour BSS taking the increased MAC efficiency as well as the increased noise into account?
  • What is the impact of introducing BSS colouring in the home environment in combination with neighbouring legacy APs and clients?
  • How does the increased use of mesh Wi-Fi solutions combine with the introduction of BSS colouring?
The students will use the MATLAB WLAN toolbox for simulation; knowledge of MATLAB and of the Wi-Fi PHY and MAC layers is key for this research project.
"Vegt, Arjan van der" <avdvegt=>libertyglobal.com>



Presentations-rp2

I hereby would like to invite you to the annual RP2 presentations, where the SNE students will be presenting their research.
Considering the wide variety of presentations the day promises to be very interesting and we hope you will join us.
Program (Printer friendly version: HTML).
Wednesday July 7 2021, 10h05 - 16h55 online using bigbluebutton.
Time #RP Title Name(s) LOC RP
10h00
10h05
10h30
10h55
11h05
11h30
11h55
13h05
13h30
13h55
14h05
14h30
14h55
15h05
15h30
15h55
16h05
16h30
16h55

Thursday July 8 2021, 10h05 - 16h55 online using bigbluebutton.
Time #RP Title Name(s) LOC RP
10h00
10h05
10h30
10h55
11h05
11h30
11h55
13h05
13h30
13h55
14h05
14h30
14h55
15h05
15h30
15h55
16h05
16h30

Presentations-rp1

Program (Printer friendly version: HTML, PDF).
(Presentations are 20 min for single and 25 min for pairs of students; yellow = requested specific day/time.)
Monday Feb 3 2020, 10h25 - 17h00 in room B1.23 at Science Park 904 NL-1098XH Amsterdam.
Time #RP Title Name(s) LOC RP
10h25
10h25
10h50
11h10
11h35
12h00
13h00
13h25
13h45
14h10
14h35
15h00
15h20
15h45
16h10
16h30

Tuesday Feb 4 2020, 10h00 - 17h00 in room B1.23 at Science Park 904 NL-1098XH Amsterdam.
Time #RP Title Name(s) LOC RP
10h00
10h00
10h25
10h50
11h10
11h35
12h00
13h00
13h25
13h50
14h10
14h35
15h00
15h20
15h40
16h00
16h20

Out of normal schedule presentations: Room B1.23 at Science Park 904 NL-1098XH Amsterdam. Program:
Date Time Place #RP Title Name(s) LOC RP #stds
2020-09-16 10h00 online 59 The current state of DNS Lame delegations; An analysis of the current state of lame delegations within the Swedish .se tld. Alexander Blaauwgeers Deloitte 2 1