Record size of incoming and outgoing traffic
This is a bit of an epic at this stage; it will be broken down into sub-issues shortly.
As part of a new feature, we need to record the size of the incoming and outgoing traffic. Currently, we use mitmproxy to capture traffic and generate pre-defined incidents from outgoing traffic to known third parties.

(1) Can we calculate the data size of each conversation that is already recorded from the outgoing traffic and add it to the incidents, and if so, how? (see interceptor/run_analyzer.py)

(2) How do we deal with incoming traffic? Incoming traffic is currently not recorded, because it has not been relevant so far and recording it would probably lead to a huge growth in log size. How can we determine the size of conversations and link them to the sender without exploding our logs?

For our incidents, we build incident objects that hold information such as the flow, receiver_url, sni etc. (https://git.app-check.org/app-check/interceptor/-/blob/master/run_analyzer.py#L350) It would be ideal to have an analogous structure for incoming third-party conversations that holds the flow, receiver_url, sni and data_size. Can we extract the required information from the headers? Currently, the response content is thrown away due to log-size concerns: https://git.app-check.org/app-check/interceptor/-/blob/master/mitmproxy/interceptor.py#L61
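To make the "analogous structure" idea more concrete, here is a minimal sketch of a size record that stores only metadata and a byte count, never the payload, so the log grows by a few fields per conversation. Everything in it is an assumption for discussion: `ConversationSize`, `flow_id`, `direction` and `total_bytes_per_host` are hypothetical names that do not exist in the interceptor, and how `data_size` is obtained (body length, Content-Length header, with or without headers) is left open.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ConversationSize:
    """Hypothetical size record; field names follow the ones listed in this issue."""
    flow_id: str            # assumption: a reference to the flow, not the flow itself
    receiver_url: str
    sni: Optional[str]
    direction: str          # "outgoing" (request) or "incoming" (response)
    data_size: int          # size in bytes; the payload itself is not stored


def total_bytes_per_host(records: Iterable[ConversationSize]) -> Dict[str, Dict[str, int]]:
    """Sum data_size per host (SNI, falling back to receiver_url) and direction."""
    totals = defaultdict(lambda: {"outgoing": 0, "incoming": 0})
    for record in records:
        host = record.sni or record.receiver_url
        totals[host][record.direction] += record.data_size
    return {host: dict(sizes) for host, sizes in totals.items()}
```

Keeping request and response sizes as separate records with a shared `flow_id` would let incoming traffic be linked back to the sender without recording response bodies.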
Our traffic_analyzer.log contains entries such as:
GET https://99.84.156.23/sdk-getSkuGUIDsByType/c2db553c59168d1b476dbaafbb7feec7
<< 200 OK 670b
We've seen 56 flows
192.168.42.15:41306: GET https://99.84.156.23/sdk-getSkuGUIDsByType/c2db553c59168d1b476dbaafbb7feec7
<< 200 OK 670b
We've seen 57 flows
192.168.42.15:41318: GET https://99.84.156.23/sdk-getSkuGUIDsByType/c3042f730b1bcf547e5f0d7fa16fdf37
<< 200 OK 76b
=> Do 670b and 76b already represent the size, and where do these values come from?
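If these `<<` lines come from mitmproxy's standard flow dump (an assumption that should be checked against our setup), the trailing token is most likely a human-readable response body size, without headers. Under that assumption, the sizes could be pulled out of the existing log as a first experiment. The sketch below assumes suffixes `b`, `k`, `m`, `g`, `t` with 1024-based multipliers; `parse_response_sizes` is a hypothetical helper, and values above 1024 bytes would only be approximate because the log prints them rounded.

```python
import re
from typing import List

# Assumed suffix convention: bytes, then 1024-based units.
_MULTIPLIERS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}

# Matches e.g. "<< 200 OK 670b": status code, reason phrase, size token.
_RESPONSE = re.compile(r"<<\s+\d{3}\s+\S.*?\s(\d+(?:\.\d+)?)([bkmgt])(?=\s|$)")


def parse_response_sizes(log_text: str) -> List[int]:
    """Return the response sizes (in bytes) found in a chunk of log text."""
    sizes = []
    for match in _RESPONSE.finditer(log_text):
        value, suffix = float(match.group(1)), match.group(2)
        sizes.append(int(value * _MULTIPLIERS[suffix]))
    return sizes
```

Parsing the log is only a stopgap; taking the size directly from the flow while it is processed would avoid the rounding and the dependency on the log format.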
In https://git.app-check.org/app-check/interceptor/-/blob/master/extractors/HttpExtractor.py#L118 we're already getting the content-length. Can this be used for our purpose, and if so, how?
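One possible way to use Content-Length is as a fallback: take the actual body length while the body is still in memory, and fall back to the header when the body has already been discarded. The sketch below is not taken from HttpExtractor.py; `message_body_size` is a hypothetical helper that accepts any header mapping and an optional body. Two caveats apply: Content-Length can be absent (for example with chunked transfer encoding), and it describes the body as transferred (possibly compressed), not the headers.

```python
from typing import Mapping, Optional


def message_body_size(headers: Mapping[str, str], raw_body: Optional[bytes]) -> Optional[int]:
    """Return the body size in bytes, or None if it cannot be determined.

    Prefers the length of the body when it is available and falls back
    to the Content-Length header otherwise.
    """
    if raw_body is not None:
        return len(raw_body)
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "content-length":
            try:
                size = int(value.strip())
            except ValueError:
                return None  # malformed header value
            return size if size >= 0 else None
    return None  # e.g. chunked transfer encoding without Content-Length
```

The same function could be applied to both the request (outgoing) and the response (incoming), so the size is recorded before the response content is thrown away.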