Commit 6c3a8bd8 authored by Martin Řepa

Update documentation

parent 4a8a1f24
......@@ -6,14 +6,16 @@
%\usepackage{caption}
%\usepackage{subcaption}
% \usepackage{fancyhdr}
% \usepackage{todonotes}
% \usepackage{amssymb}
% \usetikzlibrary{trees}
\usepackage{todonotes}
\usepackage{multicol}
\usepackage{multirow}
\usepackage{amsmath}
\usepackage{titlepic}
\usepackage{graphicx}
\usepackage{tabularx}
\usepackage{array}
\usepackage{pdflscape}
\usepackage{adjustbox}
......@@ -29,7 +31,7 @@
\clubpenalty=10000
\widowpenalty=10000
% Setting behaviour and looks of links (TODO: maybe not needed)
% Setting behaviour and looks of links
\hypersetup{
colorlinks,
citecolor=black,
......@@ -44,9 +46,7 @@
\newpage
\section{Introduction}
\subsection{Domain Name System (DNS)}
The Domain Name System (DNS) provides a way to map a human-readable hostname
to its IP address. It exists so that people can remember user-friendly
hostnames instead of hard-to-remember numeric IP addresses.
......@@ -78,7 +78,6 @@ reaches the authoritative server for desired hostname "example.test.com" and
obtains response with the IP address and some additional information.
\subsubsection{Packet structure}
DNS messages are carried over UDP or TCP on port 53. The same
message format is used for all exchanges between clients and servers.
......@@ -98,7 +97,6 @@ message format is used for all exchanges between client and servers.
\end{table}
\subsubsection{Exfiltrating data}
Unfortunately, DNS is not used only for the purpose it was designed for. Imagine a
situation in which a malicious attacker has somehow gained access to a foreign server and
compromised private data. DNS might help him to stealthily smuggle the data out to
......@@ -116,16 +114,126 @@ almost never blocked by firewall and attacker is able to setup a "dns tunnel".
DNS traffic is also rarely monitored, so it may be too late by the time the data leak
is discovered.
\subsection{Why to deal with this (?)}
\subsection{Current approach}
\subsection{Motivation}
Security companies often offer ways to detect anomalies in network traffic, such
as data exfiltration via DNS requests. The question is what to do after
the detection. The most common approach is to block the IP address;
however, the attacker can then simply change his IP address, in which case the
detection effort is lost. False positives are also quite expensive, so it is
hard to find a blocking threshold that does not restrict benign users.
The goal of this project is therefore to propose a model that addresses this issue,
taking into account all possible defender's actions (such as rate limiting, blocking, or
redirecting to honeypots), and to implement an initial skeleton to obtain first
results.
\newpage
\section{Solution}
\subsection{Game theoretic model}
\subsection{False positives}
\subsection{todos...}
\subsection{results}
I chose a game-theoretic approach to solve the issue. First, I model
a game that considers only blocking as the defender's action; I would later
expand it to include rate limiting and redirecting as well.
To find anomalous DNS requests, we need to split the space of the attacker's
actions into regions that classify the requests. Instead of choosing linear
separators or drawing some kind of spheres, I chose a more flexible approach: neural
networks, which can approximate any continuous function on a compact domain
arbitrarily well (universal approximation theorem).
See examples in $\mathbb{R}^2$.
\begin{figure}[ht]
\begin{center}
\includegraphics[width=0.3\textwidth]{img/linear_regions.png}
\includegraphics[width=0.3\textwidth]{img/sphere_regions.png} \\
\end{center}
\begin{center}
\includegraphics[width=0.70\textwidth]{img/network_regions.png}
\end{center}
\end{figure}
Every neural network classifies a request as either malicious or benign and
determines the probability of blocking the request if it is classified as malicious.
I will search for a Nash equilibrium. A Nash equilibrium is defined as \todo{definition}.
Nash's existence theorem guarantees that every finite game has at least one Nash equilibrium if mixed strategies are allowed.
A mixed strategy is defined as... \todo{definition}.
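As a point of reference until the definitions marked above are filled in, the
standard textbook formulation for a two-player game can be sketched as follows.
The symbols $\sigma_a$, $\sigma_d$, $u_a$ and $u_d$ are notation assumed here for
illustration only, and the final definitions in this text may differ.
A mixed strategy of a player is a probability distribution over that player's
actions. Writing $\sigma_a$ and $\sigma_d$ for the mixed strategies of the attacker and
the defender, and $u_a$ and $u_d$ for their expected utilities, a strategy profile
$(\sigma_a^*, \sigma_d^*)$ is a Nash equilibrium if no player can gain by deviating
unilaterally:
\begin{equation*}
\begin{aligned}
u_a(\sigma_a^*, \sigma_d^*) &\ge u_a(\sigma_a, \sigma_d^*) \quad \text{for all } \sigma_a, \\
u_d(\sigma_a^*, \sigma_d^*) &\ge u_d(\sigma_a^*, \sigma_d) \quad \text{for all } \sigma_d.
\end{aligned}
\end{equation*}
In a zero-sum game $u_d = -u_a$, so the two conditions combine into a single
saddle-point condition:
\begin{equation*}
u_a(\sigma_a, \sigma_d^*) \le u_a(\sigma_a^*, \sigma_d^*) \le u_a(\sigma_a^*, \sigma_d)
\quad \text{for all } \sigma_a, \sigma_d.
\end{equation*}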
Every defender's mixed strategy over neural networks defined as above can, without
loss of generality, be transformed into a mixed strategy over neural networks that
block a request with probability $1$ if it is classified as malicious.
Consider an example, where $NN_p$ denotes a neural network with blocking
probability $p$:
\begin{center}
\begin{equation*}
\begin{aligned}[c]
1.NN_{0.6} \text{ played with prob. } 0.1 \\
2.NN_{0.7} \text{ played with prob. } 0.6 \\
3.NN_{0.2} \text{ played with prob. } 0.3 \\
\end{aligned}
\qquad \leftrightarrow \qquad
\begin{aligned}[c]
1.NN_{1} &\text{ played with prob. } 0.1 \cdot 0.6=0.06 \\
2.NN_{1} &\text{ played with prob. } 0.6 \cdot 0.7=0.42 \\
3.NN_{1} &\text{ played with prob. } 0.3 \cdot 0.2=0.06 \\
4.NN_{1} &\text{ classifying all requests as benign played with } \\
&\text{ prob. } 1-(0.06+0.42+0.06)=0.46
\end{aligned}
\end{equation*}
\end{center}
This means that if I include a neural network classifying all requests as benign in my
future model, I only need to consider neural networks with blocking probability
equal to one, which makes the game easier to solve.
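The transformation in the example can be checked in general form. The following is
a sketch under assumed notation: the symbols $k$, $q_i$, $p_i$ and $c_i$ are
introduced here for illustration only. Let the defender play network $i$ (out of
$k$ networks) with probability $q_i$, let $p_i$ be its blocking probability, and
let $c_i(x) \in \{0, 1\}$ indicate whether network $i$ classifies request $x$ as
malicious. Under the original mixed strategy, request $x$ is blocked with
probability
\begin{equation*}
P_{\text{block}}(x) = \sum_{i=1}^{k} q_i \, p_i \, c_i(x).
\end{equation*}
In the transformed strategy, network $i$ blocks with probability $1$ and is played
with probability $q_i p_i$, while the network classifying every request as benign
is played with the remaining probability and never blocks. The blocking
probability is therefore unchanged:
\begin{equation*}
\sum_{i=1}^{k} (q_i p_i) \cdot 1 \cdot c_i(x)
+ \Bigl(1 - \sum_{i=1}^{k} q_i p_i\Bigr) \cdot 0
= P_{\text{block}}(x).
\end{equation*}
For the example above, $\sum_{i} q_i p_i = 0.06 + 0.42 + 0.06 = 0.54$, so the
all-benign network is played with probability $1 - 0.54 = 0.46$.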
\subsection{Initial model}
The game is modeled as a zero-sum normal-form game $G = (N, A, u)$, where:
\begin{itemize}
\item $N = \{\text{attacker}, \text{defender}\}$ is the set of players
\item $A = A_{attacker} \times A_{defender}$ is the set of action profiles
\begin{itemize}
\item $A_{attacker} = \{x \mid x \in \mathbb{R}^n\}$ is the set of the attacker's
actions, where $n$ is the number of features used (see below)
\item $A_{defender}$ is the set of neural networks, each of which classifies
every attacker's action as either malicious (block) or benign (do nothing)
\end{itemize}
\item $u$ is the utility (payoff) function of the attacker, denoted $f: A_{attacker} \times
A_{defender} \to \mathbb{R}$.
It is possible to experiment with the definition of this function, but I have used
the following one, which assumes the attacker wants to maximize every feature:
\begin{equation}
f(a_a, a_d) = \begin{cases}
\prod_{i=1}^{n} a_{ai} & \text{if $a_d$ classifies $a_a$ as benign}\\
0 & \text{if $a_d$ classifies $a_a$ as malicious}
\end{cases}
\end{equation}
\end{itemize}
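A small worked example may help to read the payoff function. The numbers are
hypothetical: they assume $n = 2$ features with nonnegative values and are not
taken from any dataset. For the attacker's action $a_a = (0.5, 0.8)$ and a
defender's network $a_d$,
\begin{equation*}
f(a_a, a_d) = \begin{cases}
0.5 \cdot 0.8 = 0.4 & \text{if $a_d$ classifies $a_a$ as benign}\\
0 & \text{if $a_d$ classifies $a_a$ as malicious}
\end{cases}
\end{equation*}
Because the game is zero-sum, the defender's payoff in each case is
$-f(a_a, a_d)$, that is, $-0.4$ or $0$. Note that with the product form a single
feature equal to zero makes the whole payoff zero, even if the request is not
blocked.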
The defender's optimal strategy in the game defined this way would be a neural network
that blocks every request, which is clearly undesirable, so I need to include a
false-positive constraint to prevent this. Every defender's strategy must satisfy
the constraint $FP_{rate} \le FP_{constraint}$. The exact definition and calculation
are given below \todo{link to it}.
\subsection{Solving the game}
To solve ... Double oracle \todo{add definition}
\subsubsection{Features}
\subsubsection{Data}
Speaking about dataset \todo{dataset}
\subsection{Implementation}
My implementation is written in Python 3.6 and can be viewed and tested in my
school GitLab repository for the bachelor thesis, in the \textit{research\_project} branch.
See \url{https://gitlab.fel.cvut.cz/repamart/bachelor-thesis}.
\subsection{Results}
todo
Other results, for example those using synthetic data, can be found in the
\textit{results} directory of my reference implementation.
\newpage
\section{Conclusion}