The following article discusses front-end security techniques. It aims to educate front-end developers about the risks of front-end man-in-the-middle attacks on the modern web, and to give direction on how to prevent them.
Man-in-the-middle (MITM) attacks are a type of orchestrated hack in which a malicious actor sits in the middle of a supposedly secure session between two parties, typically a client and a server. Historically, this was a much larger problem than it is today; fundamental technology changes, most notably HTTPS, have reduced it considerably.
A classic historical example of such an attack involved unsecured wifi networks, typically in public places like coffee shops. Unsecured wifi means the packets of data sent over radio between the client’s machine and the router are not encrypted. A malicious actor could use software on a laptop to record the traffic, then inspect those packets with easily accessible tools like Wireshark. It used to be a trivial process to read people’s Facebook, Instagram, email or even financial data, or to steal application sessions to hijack users’ accounts. Needless to say, this data had value: it could be sold on black markets, or aid crimes such as stalking, harassment and blackmail. Card details submitted online were trivial to obtain.
Sometimes these attacks went further. An attacker could run a rogue public network that your device would automatically join in public places such as underground stations. When your device requested updates from servers (latest notifications, instant messages, emails, social network feeds), these would be gathered en masse without the victims’ knowledge, and without them even needing to unlock their devices or manually connect to a network, provided the device had previously connected to a network with the same SSID as the hacker’s.
Today these problems have been largely mitigated. It’s now normal for HTTPS to secure the traffic between the browser and the server, meaning that even if your device is connected to unsecured wifi, your data is sent through an encrypted channel strong enough that it is practically infeasible for the hacker to decrypt the traffic and make use of it. The computing power needed to obtain the “keys” would cost far more than the data is worth, and the data would have lost its value long before the keys were found.
It’s worth noting that not all client-server communication today is secured by HTTPS, despite the best efforts of internet giants like Google, whose Chrome browser shows prominent warnings to users accessing unsecured sites, and despite changes in various laws and regulations, such as those requiring specific levels of SSL certificate, mandating secure storage of data, or governing financial services. Today the common man-in-the-middle attacks are largely a thing of the past. However, they’re not gone.
As time progresses, so does technology. If you’re a software developer working with front-end technologies - the stuff that goes on in your browser rather than on a server - you’d testify to just how fast this moves. Just 10 years ago, frameworks and front-end technologies that are now listed de facto on job specifications didn’t exist. Tech houses commonly recruiting developers with 5 years’ ReactJS experience, for a framework released only 8 years ago, is a testament to that.
As application logic moves from the server to the front-end, new avenues for man-in-the-middle attacks are appearing in unexpected ways. I discovered one such example when curiosity got the best of me on the viral website for the 2021 Matrix film. The website employed a brilliant feature that displayed unique tiny clips from the upcoming trailer each minute of the day, so fans kept returning to view the site’s video over and over, hoping to discover new information about the long-anticipated movie. Fans then flocked to social sites to discuss their findings.
I immediately wanted to understand how the developers had built this. There is an architectural challenge in building a secure, scalable solution that can provide different video content by the minute and render it in the browser. Essentially, every minute of the day may map to a different video, and the site has to cater for that. The browser needs to obtain the current video from the server without revealing to the user what the videos for other minutes are. However, the code determining this is easily readable to any user with some technical know-how.
The way the developers behind the website achieved this is as follows:
- The browser loads the webpage and the JavaScript code is executed in the browser
- The current minute is determined by the browser, which refers to the system’s clock
- The time is put into a complicated obfuscation algorithm - a way of turning the time (e.g. 12:03) into something difficult to understand that still represents the same value (e.g. asjfyeof23b3h30f, which is seemingly random)
- The value is sent to the server as the requested filename of a video file
- The resulting video, if found, is streamed to the browser
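The steps above can be sketched roughly as follows. This is a minimal illustration, not the site’s actual code: `obfuscate`, `videoUrlForNow` and the `example.com` base URL are hypothetical, and the toy transform only stands in for whatever the real bundle does.

```javascript
// Hypothetical stand-in for the site's obfuscation routine: a deterministic,
// unkeyed transform of the "HH:MM" string (hex of each character code, reversed).
function obfuscate(time) {
  return Array.from(time)
    .map((ch) => ch.charCodeAt(0).toString(16).padStart(2, "0"))
    .reverse()
    .join("");
}

// Builds the video URL for "now". The clock is a parameter for illustration;
// in a browser the default is the user's own system clock.
function videoUrlForNow(now = new Date()) {
  const hh = String(now.getHours()).padStart(2, "0");
  const mm = String(now.getMinutes()).padStart(2, "0");
  const time = `${hh}:${mm}`;
  // example.com is a placeholder host, not the real site.
  return `https://example.com/${obfuscate(time)}.mp4`;
}

console.log(videoUrlForNow());
```

Every input to this flow (the clock and the transform) lives on the client, which is what the issues below rely on.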
There are a few issues with this approach.
Firstly, the time is ultimately determined by the user’s system clock (e.g. their iPhone’s clock), which the user can change at any time. This means if you wanted to view a specific time’s video, you could simply change your system’s clock and refresh the webpage.
Secondly, the value being sent to the server is not encrypted; it is obfuscated. Without going into nitty-gritty detail, this means you don’t need a secret key or password to work out the original value or to generate the final code (the seemingly random bunch of letters and numbers). It’s entirely possible, as I will demonstrate, to generate the obfuscated output without bypassing, cracking or otherwise breaking any encryption. Obfuscation is not security.
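To make the distinction concrete, here is a toy obfuscation built from Base64 and a string reversal. It is not the one the site used, and both function names are hypothetical. Nothing in it depends on a secret, so anyone who can read the code can run it forwards or backwards.

```javascript
// Toy obfuscation: Base64-encode the time, then reverse the string.
// There is no key, so the code itself is all an attacker needs.
function toyObfuscate(time) {
  return btoa(time).split("").reverse().join("");
}

// Undoing it only requires reading the function above.
function toyDeobfuscate(code) {
  return atob(code.split("").reverse().join(""));
}

const code = toyObfuscate("12:03");
console.log(code);                 // looks random at a glance
console.log(toyDeobfuscate(code)); // recovers the original time
```

Real encryption or signing would need a key the user never sees, which is not possible when the whole routine ships to the browser.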
So we return to man-in-the-middle attacks. If we wanted to generate the same obfuscated strings to create a full catalogue of all available times, we would simply need to put every possible time into the same obfuscation function the code already uses, then record the outputs. This is easier said than done. The JavaScript on the page is compiled (typically bundled and minified), meaning the original code isn’t easily readable. Specific code needs to be executed in a specific order to reach the application state that generates the obfuscated codes. Recreating that outside the browser is a hard challenge which could take weeks to pull off.
Taking a step back, what we want to do is create an application state identical to the one running in the browser, and inject our own logic into that state at a time of our convenience. Ideally, we just want to put a little bit of code before the time is obfuscated, then record the results after. For this specific example a simple function is called which gets a result. Something like the following (this isn’t the actual code but an example of the problem):
const obfuscatedCode = getObfuscatedCode(time);
const filename = "https://example.com/" + obfuscatedCode + ".mp4";
return filename;
If we could do something like this instead:
times.forEach(time => {
  const obfuscatedCode = getObfuscatedCode(time);
  const filename = "https://example.com/" + obfuscatedCode + ".mp4";
  console.log(time + " => " + filename);
});
This would break our browser’s running application, but would also write every possible time and its corresponding video URL to the browser's console for us to browse at our leisure.
But how is this achievable? It would be too time-consuming to recreate this “getObfuscatedCode” function outside of the running code in the application. We can’t just add this code into the browser’s running instance of the application. We can’t modify the server’s distributed code because it is secured by HTTPS - or can we?
Although HTTPS prevents man-in-the-middle attacks, it does this through what are known as certificate authorities, or “trust authorities”. The technicalities are a conversation for another day. What you need to know is that your browser only accepts an HTTPS connection when the server’s certificate is signed by an authority your browser or device already trusts, so an interceptor without such a certificate cannot pose as the server and read the traffic. This is why unsecured wifi networks don’t pose much of a threat to your banking app. What most people don’t know is that it’s trivial to set up your own trust authority and add it to your browser. This does require physical device access and authorisation - it can’t be achieved through wifi snooping, or by compromising the remote server, for example.
In our scenario, we’re going to use off-the-shelf proxy software to run a man-in-the-middle attack on our own session. Charles Proxy, a popular proxy for this purpose, allows us to install a trusted authority in our browser, then pause and modify network requests and responses on the fly. When the browser requests the JavaScript application code that generates the obfuscated codes, we pause the response from the server, modify the JavaScript and, once we’re happy, let the modified response continue to the browser, where it is executed. After the code has run, all our times and videos are logged and we can choose which one to view, bypassing the restriction the developers intended: that you’d have to wait until a given time to view a specific video.
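In a proxy like Charles this edit is made by hand while the response is paused. Purely to illustrate the kind of text change involved, the function below performs the same splice on a response body. The `TARGET` snippet, the `PAYLOAD` and the assumption that `times` and `getObfuscatedCode` are in scope inside the bundle are all hypothetical.

```javascript
// Hypothetical line in the served bundle that we want to intercept.
const TARGET = "return filename;";

// Code spliced in just before the target so every time is logged first.
// Assumes `times` and `getObfuscatedCode` exist in the bundle's scope.
const PAYLOAD =
  'times.forEach(t => console.log(t + " => https://example.com/" + getObfuscatedCode(t) + ".mp4"));';

// Returns the response body with the payload inserted before the target,
// or the body unchanged if the target snippet is not present.
function rewriteResponse(body) {
  return body.includes(TARGET) ? body.replace(TARGET, PAYLOAD + TARGET) : body;
}

console.log(rewriteResponse("function f(){return filename;}"));
```

The server never sees this change; only the copy of the code delivered to our own browser is altered.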
It would be possible to take this a step further: run a simple script to download all the videos, run them through a speech-to-text program or API, then run a diff tool against the resulting transcripts to determine which ones contain different content, saving us the time of manually browsing 1,440 different video files to find the spoilers we so desperately want to uncover.
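As a rough sketch of that follow-up step, the helpers below enumerate every minute of the day and group clips whose transcripts match. The download and speech-to-text stages are left out; the `transcripts` map is a hypothetical stand-in for their output, and both function names are illustrative.

```javascript
// Every minute of the day as "HH:MM" strings: 24 * 60 = 1,440 entries.
function allMinutesOfDay() {
  const times = [];
  for (let h = 0; h < 24; h++) {
    for (let m = 0; m < 60; m++) {
      times.push(`${String(h).padStart(2, "0")}:${String(m).padStart(2, "0")}`);
    }
  }
  return times;
}

// Groups times by normalised transcript text so identical clips collapse.
// `transcripts` is a hypothetical Map of time -> text from a speech-to-text step.
function groupByTranscript(transcripts) {
  const groups = new Map();
  for (const [time, text] of transcripts) {
    const key = text.trim().toLowerCase();
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(time);
  }
  return groups;
}

console.log(allMinutesOfDay().length);
```

Exact-match grouping is the simplest choice here; noisy transcripts would likely need a fuzzier comparison.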
Running a man-in-the-middle attack on your own HTTPS-secured session is actually very easy to do by following a 5-minute YouTube tutorial. It’s something I’ve done often when debugging and pentesting applications I have been writing for the web. It allows you to identify security problems using the same methods that malicious actors could use, and so to solve them before they become real-world issues.
Why is this important? As developers continue to bring new technologies into the front-end, we need to be aware of the security issues we’re introducing that affect our clients or business model. Had the example used here been widely exploited before the full release of the trailer (which, at the time of writing, has been officially released), it could have had a small impact on the marketing campaign, should the contents of those unique clips have been widely disclosed. Had the same flaw appeared in, say, the 2-factor authentication token for a banking app, we’d have a much worse and potentially very costly problem on our hands.
There is also the issue of trust authority hacks on targeted individuals. It is widely known that persons of political or financial interest, such as diplomats, political activists or wealthy company directors, have been targeted by such attacks and exploited for various gains.
While software engineers can’t solve hardware-targeted hacks, we do have a responsibility to understand security flaws in our front-end applications that could affect our business models, as in the scenario discussed. How are we implementing resource restrictions? Are our feature flags settable in our application code, and what would be the impact if incomplete features were made known to our wider customer base? Are our application permissions exploitable if a client can modify variables in our front-end code? If we answer yes to any of these, we need to revisit that code in light of today’s still-present threat from the age-old enemy - the man-in-the-middle attack.
Another consideration is whether a given piece of security logic should live in our front-end codebase at all. In the example of the Matrix trailer site, the mistake was using the front-end available to the user to generate what is essentially a security token for accessing a restricted resource. This is an architectural security issue. Had the developers identified this and refactored their codebase to deploy an API endpoint which simply returned the video URL for the current time according to the server’s own clock, it would have been impossible for a user to determine any of the video resources before the server returned them at the given clock time.
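A minimal sketch of that server-side alternative might look like the following. The catalogue contents, the clip names and both function names are hypothetical; the handler is written in the `(req, res)` shape that Node’s `http.createServer` accepts, but no server is started here.

```javascript
// Hypothetical server-side catalogue: minute of day (UTC) -> opaque clip name.
// It never leaves the server, so the client cannot enumerate it.
const catalogue = new Map([
  ["12:03", "clip-a91f"],
  ["12:04", "clip-07ce"],
]);

// Resolves the clip for the server's own clock; nothing from the client is used.
function currentVideoUrl(now = new Date()) {
  const hh = String(now.getUTCHours()).padStart(2, "0");
  const mm = String(now.getUTCMinutes()).padStart(2, "0");
  const clip = catalogue.get(`${hh}:${mm}`);
  return clip ? `https://example.com/${clip}.mp4` : null;
}

// Request handler: replies with JSON containing the current URL, or 404.
function handleCurrentVideo(req, res) {
  const url = currentVideoUrl();
  res.writeHead(url ? 200 : 404, { "Content-Type": "application/json" });
  res.end(JSON.stringify({ url }));
}
```

For this to hold up, the clip names would also need to be unguessable, since a URL learned earlier in the day would presumably remain reachable unless the server restricts it too.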
Often this is the answer: security access needs to be handled by application servers, not application clients. Otherwise we run the risk of OWASP-listed issues such as privilege escalation, which even today is one of the most severe security issues affecting modern web applications.
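The same principle applies to permissions. In the sketch below, the session store, token values and function name are all hypothetical; the point is only that the decision is made from state held on the server, so a role claimed by a modified client has no effect.

```javascript
// Hypothetical session store: token -> role, held only on the server.
const sessions = new Map([
  ["token-alice", "admin"],
  ["token-bob", "viewer"],
]);

// Decides access from server-side state. Any `role` field the client sends
// along with the request is deliberately ignored.
function canViewDrafts(request) {
  const role = sessions.get(request.token);
  return role === "admin";
}

// A tampered client claiming to be an admin is still refused.
console.log(canViewDrafts({ token: "token-bob", role: "admin" }));
```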
In conclusion, don’t presume your front-end code is secure because the transport layer (HTTPS) offers security. If the client determines security access, then any user is able to redetermine it. The man-in-the-middle is very much alive.