Angel Kodituwakku

Computer Engineering professional with a track record of solving large-scale problems and extensive research experience in cybersecurity. Experienced in software architecture design, databases, visualization, AI/ML, and deep learning.


Education

The University of Tennessee, Knoxville, TN


PhD in Computer Engineering

August 2019 - May 2021

Concentration: Cybersecurity

Dissertation: InSightNG: A System to Detect Spoofing in Computer Networks Using Per-host Behavior-based Host Descriptors

GPA: 3.75


• The objective of this research is to develop a system that uniquely identifies hosts in a network without relying on addresses such as MAC or IP, so that address spoofing can be detected.

• I developed a unique identifier based on the behavior of the host, called a Host Descriptor (HD).

• An HD is a statistical model representing a host, rather than an arbitrary address.

• An HD stores the MAC and IP addresses under which the host has been seen over time.

• This information is made available to the security analyst in the HD table (HDT), so that they can look up all the MAC and IP addresses a host has used, along with their timestamps.

• InSight Next Generation (InSightNG) is the distributed sensor-based system that we developed to collect host and network data, calculate an HD per host, and keep track of each host.

• Using this system we show that we are able to detect address spoofing in real-time.
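
As a rough illustration of the HD table idea described above, here is a minimal Python sketch. It is not the InSightNG implementation: the class names, the running-mean "model", and the rule that an IP attributed to more than one host is suspicious are all assumptions made for this example, and the step that matches observed behavior to a host ID is assumed to happen upstream.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class HostDescriptor:
    """Hypothetical per-host record: a behavior model plus the addresses seen for it."""
    host_id: str
    # Running mean of a behavior feature vector (stand-in for the statistical model).
    mean: list = field(default_factory=list)
    samples: int = 0
    # (timestamp, mac, ip) tuples attributed to this host over time.
    sightings: list = field(default_factory=list)

    def update(self, features, timestamp, mac, ip):
        """Fold one observation into the running mean and log its addresses."""
        if not self.mean:
            self.mean = [0.0] * len(features)
        self.samples += 1
        self.mean = [m + (x - m) / self.samples for m, x in zip(self.mean, features)]
        self.sightings.append((timestamp, mac, ip))


class HostDescriptorTable:
    """Hypothetical HDT: maps behavior-derived host IDs to their descriptors."""

    def __init__(self):
        self.hosts = {}

    def observe(self, host_id, features, timestamp, mac, ip):
        hd = self.hosts.setdefault(host_id, HostDescriptor(host_id))
        hd.update(features, timestamp, mac, ip)

    def hosts_using_ip(self, ip):
        """Return {host_id: [timestamps]} for every host seen with this IP."""
        result = defaultdict(list)
        for hd in self.hosts.values():
            for ts, _mac, seen_ip in hd.sightings:
                if seen_ip == ip:
                    result[hd.host_id].append(ts)
        return dict(result)

    def possible_spoofing(self, ip):
        """Assumed rule: an IP attributed to more than one host is suspicious."""
        return len(self.hosts_using_ip(ip)) > 1
```

An analyst-style lookup would call `hosts_using_ip("10.0.0.5")` to list every behavior-identified host that used that address and when.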


MS in Computer Engineering

Jan 2016 - Aug 2017

Concentration: Computer Networks

Thesis: InSight2: An Interactive Web-based Platform for Modeling and Analysis of Large Scale Argus Network Flow Data

GPA: 3.59


• I developed and currently maintain InSight2, a real-time situational awareness and flow analytics platform for large scale networks.

• It ingests Argus flow data, either streamed in real time over a network socket or read from Argus files, and enriches it with contextual information.

• It is highly scalable, with a multithreaded backend for data enrichment.

• It is extensible, giving third-party modules access to the enriched data.

• It also provides data export for use with external data analysis tools.

• It uses Elasticsearch as the database and search engine for fast responses to analyst queries, and Kibana for real-time visualizations.

• It provides quick access to visualizations for different use-cases using a custom web-based front-end.
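
The enrichment step described above can be sketched in Python roughly as follows. This is an illustration only, not InSight2's code: the flow record fields (`src_ip`), the lookup tables, and the function names are hypothetical, and a thread pool stands in for the multithreaded backend.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical lookup tables; a real deployment would load threat lists and GeoIP data.
THREAT_TAGS = {"203.0.113.7": ["scanner"]}
COUNTRY_BY_PREFIX = {"203.0.113.": "ZZ", "198.51.100.": "YY"}


def enrich(flow):
    """Return a copy of a flow record with threat tags and a country code added."""
    enriched = dict(flow)
    src = flow["src_ip"]
    enriched["tags"] = THREAT_TAGS.get(src, [])
    enriched["src_country"] = next(
        (cc for prefix, cc in COUNTRY_BY_PREFIX.items() if src.startswith(prefix)),
        "unknown",
    )
    return enriched


def enrich_all(flows, workers=4):
    """Enrich flows on a thread pool; map() preserves input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(enrich, flows))
```

Each flow is copied before tagging, so the original records stay untouched and can still be exported or re-processed by other modules.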


The University of Moratuwa, Moratuwa, Sri Lanka


BSc Honors in Electronics and Telecommunication Engineering

August 2010 - May 2015


Dean’s List award (Dec 2010)

Experience

Work Experience


Research Associate III

August 2017 - August 2019

Employer: The University of Tennessee, Knoxville, TN

Project: InSight2 R&D Project (NSF IRNC/AMI/1450959 | PI: Dr. Jens Gregor)


• Prototyped and optimized the InSight2 network situational awareness and analytics platform (over 40x faster than the original InSight) using a novel parallel-processing architecture, and tested it on GLORIAD R&E Network flow data. (InSight2 Overview PDF)

• Led a team of seven graduate and undergraduate level students as the lead developer to implement InSight2.

• Deployed and maintained InSight2 at Stanford University, University of Tennessee, Queen’s University, KISTI and a French institute for real-time network monitoring, visualization, analytics and anomaly detection.

• Developed predictive analytics modules based on Markov chains, analyzed Stanford University SoE flow data and WRCCDC CTF 2019 competition data using InSight2, and published a journal paper in collaboration with Alex Keller.

• Developed a deep learning model in Python with TensorFlow that detects compromised IP addresses in enriched flow data with 97.7% accuracy.

• Developed four offensive techniques targeting industrial control systems, generated data, trained ML models to detect them, and published a journal paper in collaboration with the Nuclear Engineering Dept. at UT.

• Architected the novel hybrid distributed federated InSightNG platform and sensor network for large-scale deployments to dramatically reduce time-to-detection of network threats which I extended as my PhD work.
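
To illustrate the Markov-chain approach to predictive analytics mentioned above, here is a minimal Python sketch of a first-order chain over discretized utilization states. The state labels and function names are assumptions for this example, and the actual InSight2 module may differ.

```python
from collections import Counter, defaultdict


def fit_transitions(states):
    """Estimate first-order transition probabilities from an observed state sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return {
        s: {t: n / sum(c.values()) for t, n in c.items()}
        for s, c in counts.items()
    }


def predict_next(transitions, state):
    """Return the most likely next state, or None if the state was never seen."""
    row = transitions.get(state)
    if not row:
        return None
    return max(row, key=row.get)
```

Observed activity that repeatedly departs from the predicted next state is one possible signal of unexpected behavior.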

Internship


Research Intern

December 2013 – June 2014

Employer: GLORIAD R&E Network

Project: InSight Advanced Performance Measurement System (NSF IRNC/ProNet/0963058 | PI: Greg S. Cole)


• Developed two real-time analytics modules to detect network scanning and spamming activity in Argus flow data.

• Contributed to NSF grant proposal IRNC/AMI/1450959 which later funded my MS program.
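
One common heuristic for spotting scanning in flow data is to count how many distinct destinations each source contacts. The Python sketch below shows that heuristic only; the field names and the threshold are assumptions, and it is not the detection logic of the modules listed above.

```python
from collections import defaultdict


def find_scanners(flows, threshold=100):
    """Flag sources that contact more than `threshold` distinct (dst_ip, dst_port) pairs."""
    targets = defaultdict(set)
    for flow in flows:
        targets[flow["src_ip"]].add((flow["dst_ip"], flow["dst_port"]))
    return sorted(src for src, seen in targets.items() if len(seen) > threshold)
```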

Skills and Certs

Programming


Proficient


Python

JS

Experienced


C++

Java

PHP

SQL

HTML


Tools and Libraries



Git

TensorFlow

Cisco VIRL

Matlab

AWS

ScikitLearn

Argus

Docker

GCP

Matplotlib

NetFlow

Jupyter

Elasticsearch

Linux

Pen Test Tools

Puppet


Certifications


Cisco Certified Network Associate (CCNA)


• Routing & Switching

• Verification: 415552904029ITUG



Chartered Management Accountant (Foundation Level)


• Chartered Institute of Management Accountants


Milestones

BS (Hons)

Electronics and Telecommunication Engineering


Internship

Internship: Research Intern


GLORIAD R&E Network

MS

Computer Engineering, concentration: Computer Networks.


Research Associate

Develop InSight2 Network Monitoring and Analysis Platform


PhD

Computer Engineering, concentration: Cybersecurity


Journals and Conferences

Journals


InSight2: A Modular Visual Analysis Platform for Network Situational Awareness in Large-Scale Networks


https://www.mdpi.com/2079-9292/9/10/1747/htm

The complexity and throughput of computer networks are rapidly increasing as a result of the proliferation of interconnected devices, data-driven applications, and remote working. Providing situational awareness for computer networks requires monitoring and analysis of network data to understand normal activity and identify abnormal activity. A scalable platform to process and visualize data in real time for large-scale networks enables security analysts and researchers to not only monitor and study network flow data but also experiment and develop novel analytics. In this paper, we introduce InSight2, an open-source platform for manipulating both streaming and archived network flow data in real time that aims to address the issues of existing solutions such as scalability, extendability, and flexibility. Case-studies are provided that demonstrate applications in monitoring network activity, identifying network attacks and compromised hosts and anomaly detection.


Multilayer Data-Driven Cyber-Attack Detection System for Industrial Control Systems Based on Network, System, and Process Data


https://ieeexplore.ieee.org/abstract/document/8604075

The growing number of attacks against cyber-physical systems in recent years elevates the concern for cybersecurity of industrial control systems (ICSs). The current efforts of ICS cybersecurity are mainly based on firewalls, data diodes, and other methods of intrusion prevention, which may not be sufficient for growing cyber threats from motivated attackers. To enhance the cybersecurity of ICS, a cyber-attack detection system built on the concept of defense-in-depth is developed utilizing network traffic data, host system data, and measured process parameters. This attack detection system provides multiple-layer defense in order to gain the defenders precious time before unrecoverable consequences occur in the physical system. The data used for demonstrating the proposed detection system are from a real-time ICS testbed. Five attacks, including man in the middle (MITM), denial of service (DoS), data exfiltration, data tampering, and false data injection, are carried out to simulate the consequences of cyber attack and generate data for building data-driven detection models. Four classical classification models based on network data and host system data are studied, including k-nearest neighbor (KNN), decision tree, bootstrap aggregating (bagging), and random forest (RF), to provide a secondary line of defense of cyber-attack detection in the event that the intrusion prevention layer fails. Intrusion detection results suggest that KNN, bagging, and RF have low missed alarm and false alarm rates for MITM and DoS attacks, providing accurate and reliable detection of these cyber attacks. Cyber attacks that may not be detectable by monitoring network and host system data, such as command tampering and false data injection attacks by an insider, are monitored for by traditional process monitoring protocols.
In the proposed detection system, an auto-associative kernel regression model is studied to strengthen early attack detection. The result shows that this approach det...

Conferences


FloCon 2018


https://flocon2018.sched.com/event/CNc2/insight2-an-interactive-web-based-platform-for-modeling-and-analysis-of-large-scale-argus-network-flow-data

Network monitoring systems are paramount to the proactive detection and mitigation of problems in computer networks related to performance and security. Degraded performance of network equipment and compromised end-nodes can cost computer networks downtime, data loss, and reputation. InSight2 is a web-based platform developed for the purpose of proactive and predictive monitoring of network performance and security aspects and providing intuitive visualizations thereof in organized dashboards in near real time. InSight2 models and analyzes network transactions to provide insight in to the network performance such as current bandwidth utilization, packet rate, packets dropped and the number of nodes online. InSight2 also uses up-to-date emerging threat lists and data analytics to identify denial of service attacks, botnets, ransomware servers, bogons, compromised hosts, spammers, scanners and a host of other types of malicious agents in the network. All data is automatically tagged with geographical, organizational, and other related information for identification and further investigation. InSight2 processes Argus flow records which provide information such as number of bytes and packets transmitted, number of packets lost and retransmitted, jitter, and inter-packet delay for each flow. Emerging threats are extracted from multiple up-to-date repositories to build a threats database which is used to enrich each flow by adding one or more searchable tags. InSight2 utilizes MaxMind GeoIP to add geographical information such as country and city information as well as latitude-longitude coordinates which are used to plot the source and destination nodes in interactive global maps. The Global Science Registry from the GLORIAD project is used to enrich network flows with organizational information. Elasticsearch serves as the back-end database and search engine. An associated Kibana module handles the data visualization. 
Markov Chains are used to predict network activity based on past behavior. InSight2’s front-end incorporates user authentication, SSL encryption, and isolation of the dashboard controls from the end user by displaying the dashboards in a modern and unified web-interface that allows the network administrator to show customized information based on user privileges. InSight2 runs under any Linux operating system as a system service. An installer is provided that requires minimal user interaction. InSight2 includes a user guide and a video tutorial to get the users up to speed with installation and usage quickly. Development of InSight2 is supported by the National Science Foundation under Grant No. IRNC-1450959.


FloCon 2019


https://flocon2019.sched.com/event/JKSP/insight2

Network throughput and complexity are increasing due to the increasing number of devices and data-driven applications, especially at universities and Research and Education (R&E) Networks. In this talk we present InSight2, an open platform, intended to monitor and facilitate the development of network analytics for these large-scale networks. University and R&E networks are facing a deficiency in operational and security awareness. Real-time behavioral visibility and analysis of networks are crucial to detect problems, predict patterns and protect the data and critical assets. Conventional monitoring techniques and tools do not scale well in these environments. Novel analytics must be developed to understand traffic behavior and security issues, addressing the complexity and throughput of these networks. Network managers, operators and analysts face difficulty finding tools to analyze the amount of the data they collect. Researchers and educators encounter a barrier to entry to develop network analytics. These issues can be addressed by an open platform, that facilitates collaboration among the global community for the development and improvement of network analytics. We present two analytics modules. The predictive analytics module forecasts network utilization and enables the detection of unexpected behavior. The botnet detection module identifies botnet activity in network traffic. Results from its various deployments as well as benchmarks are also presented.


FloCon 2020


https://flocon2020.sched.com/event/WrvV/using-deep-neural-networks-to-detect-compromised-hosts-in-large-scale-networks

Detecting compromised hosts in networks is an important cyber security challenge. Investing in defenses on the perimeter of the network is key to prevent compromises within the network. However, hosts are compromised at an alarming rate due to security breaches and insider threats. It is becoming impossible for network security analysts to keep up with the barrage of data to manually detect compromises. Automating the detection of compromises and providing decision support play a key role in optimizing the analyst's workflow. Various statistical modeling techniques have been proposed to assist analysts with detecting compromised hosts by examining their behavior on the network at flow level. But most of this research lacks real datasets that reflect modern attacks, preventing their use in real-world scenarios. Literature tends to use benchmark data sets that are simulated and outdated. In this presentation, we discuss the generation of a new dataset based on recent, real network data from global research and education that is fused with actual threat lists and contextual information. This augmented data set provides ground truth in training supervised statistical models. We describe the development of a statistical model based on deep neural networks. Using these cutting-edge modeling techniques, we were able to detect compromised hosts in a network using the InSight2 platform at a high accuracy and low false positive rate. Compared to existing statistical models, our model is readily deployable in wide range of networks, since it has been developed using real-world data. We present case studies based on its deployments at academic institutions and explore its impact in real-world applications from both academic and industrial viewpoints. These case studies use several visualization techniques to show the initial detection, exploration of the source of the attack, command and control centers, and lateral movement of cyber security threats. 
This process generates further data that can be used to improve the accuracy of the model as the analyst documents and categorizes the threat after investigation.

Attendees Will Learn:

• Latest developments in statistical modeling used for threat detection

• How deep learning can be used for better accuracy

• Complementing and improving the analyst workflow


FloCon 2021


https://flocon2021.sched.com/event/euPK/improving-analyst-workflow-using-behavior-based-host-detection-to-combat-spoofing

This presentation describes a scalable distributed system to identify hosts based on behavior rather than addresses. When hunting for particular threats or looking for anomalies in general, finding all the resources that could be a part of a malicious behavior can be challenging. Finding as many network flows as possible that tie with the threat can be very time consuming and prone to error depending on the sophistication of the attack. Even when an initial set of addresses has been discovered to be connected to a given threat, it can be difficult to track them across time since addresses can easily be spoofed. Tracking behavior can be more useful than tracking addresses since attack behavior is harder to modify than addresses, checksums, or email wording. We generate evolving statistical models per host, attribute all addresses seen from that host and automatically cluster hosts based on their statistical distance. The analyst can query the system with an address seen at a given timestamp to traceback the threat to its origin in time and location (geolocation or local subnet), other addresses it used, other hosts it may have potentially compromised and C2 IP addresses it communicated with etc. since all flow data is tied to a unique identifier rather than an IP or MAC address. Having a knowledge system that builds and keeps track of statistical models per-host in real-time not only can automate time-consuming parts of the analyst workflow, improve accuracy, reduce missed events and discover secondary threats but also proactively detect anomalies, improve damage assessment and find compromised devices in the network. Furthermore, with adequate amount of training data the system can be trained to proactively look for threats of known signatures and anomalies in real-time. This information can not only be used for threat hunting and anomaly detection but also assess risk to an enterprise to improve threat and risk modeling. 
Attendees Will Learn: The attendees will learn new developments in tracking host behavior. We will discuss the development of network and statistical models to create per-host models that will be used to uniquely identify hosts when their addresses are spoofed. We present the design decisions for the models and how they can be used to improve the analyst workflow in novel ways.

Recent Activities


CMU / SEI Webinar : Solving Current Cyber Challenges: Academic and Industry Collaboration


https://www.youtube.com/watch?v=ArNvggWUlK0

The chasm between what academia researches and what industry uses in cyber is wide. By building mutually beneficial collaborations between the two, we can improve algorithms, datasets and techniques that are applicable to the real world. Students and researchers should build a solid partnership with professionals early in their careers to be exposed to and ground their work in current industry challenges. This ultimately results in more research being transformed into practical solutions.

Collaboration between academia and industry is one of the best ways for industry to direct academic research outcomes toward current problems. Without collaboration, it can be challenging for academia to produce algorithms, datasets and techniques that are directly applicable to real-world problems. Students and researchers have to build a working loop with professionals early in their careers to maximize the relevance of their work in practice, which ultimately results in more research being transformed into practical solutions.

What Will Attendees Learn?

• The need for platforms that bring industry and academia together to exchange problems and potential solutions, capitalizing on the strengths each side can offer (academia: dedicated research; industry: defining the problem space as a practitioner).

• Collaborations can give industry professionals novel paradigms and help them to approach their problems differently.

• The importance of validation of academic work by industry, in addition to peer-reviewed publications.


Grant Proposals


InSight Advanced Performance Measurement System (NSF IRNC/ProNet/0963058 | PI: Greg S. Cole)

https://www.nsf.gov/awardsearch/showAward?AWD_ID=1450959&HistoricalAwards=false

The GLORIAD/InSight program is a global, open-source software development effort to research and experimentally deploy advanced flow-level network measurement technologies at various levels of the research and education (R&E) network eco-system. The tools developed will enable far-reaching research towards better understanding network utilization, identifying network application performance issues while carefully attending to differing community concerns and requirements regarding data privacy and security. Experimental deployments will showcase actionable analytics and visualizations for network operations, new methods and models of data sharing across the global R&E fabric, and thus a better understood, more performant fabric. Through a global, community-focused, open-source development effort, the project extends the current beta version of InSight - the flow-level passive measurement, analysis and visualization system in use on the GLORIAD network. The InSight tools are based on passive network measurement and monitoring by combining the rich detail of comprehensive, non-sampled, bi-directional, multi-model, multi-layer Argus flow-data with modern big-data analytic and visualization tools. A flexible stream-based method of enriching network flow metadata enables broader, customer-defined analytics. Working closely with interested large-network providers, the project works toward experimentally deploying InSight on links up to 100 Gbps.