BAD BASS: Phishing through Webview Injection
Introduction to Banking Malware
Zeus, SpyEye, and other banking malware of the pre-ransomware age have interested me since their discovery because of their ability to modify web content in browser processes through clever use of API hooks. This allows these types of malware not only to capture credentials (which could be accomplished with a keylogger) but also to solicit additional information (such as social security numbers) that could prove useful for future fraudulent activity. Beyond that, these trojans can even obscure the evidence of their theft by rewriting the HTML content shown to the victim. The malware operator may have used the victim’s bank account to wire money to a mule, but the balance will appear in the browser as if nothing had changed.
Advances in Defense/Offense
With greater adoption of multi-factor authentication, EDRs, and better detection of process injection and hooking, the approach of those banking trojans has become less effective. That is not to say they cannot work, but they would need additional development in order to maintain stealth and operate in an account with MFA enabled.
Other approaches include using built-in functionality of browsers such as the Chromium-based remote debugger port CLI flag, malicious browser extensions, or simply decrypting the encrypted cookies on disk. Those come with their own pros and cons. With the remote debugger port, no files need to be dropped to disk, and no memory injection is required. However, high-fidelity detections already exist that can alert on that type of attack, since that behavior is highly unusual for most machines. With browser extensions, no process injection is required either, but files have to be dropped to disk, which may increase the risk of detection.
BAD BASS
What it (Does/Does Not) Do
BAD BASS is a proof-of-concept for an experimental approach to phishing through the use of webviews that have been injected into browser processes. It works by injecting a webview into Chromium-based browsers in order to phish for credentials and solicit additional information that could prove useful for its operator.
In its current form, it does not defeat MFA. A small amount of work would be required to integrate it with something like Evilginx2 that could make this possible, and dynamic, remote web-inject configuration would be needed to make it work. A downside of MITM tools like Evilginx2 is the need for a phishing domain that could alert a sufficiently aware user. Combining the injected webview with Evilginx2 would allow an MFA-aware MITM attack without alerting the user through a suspicious domain in the URL box.
Additionally, in its current form, it has several major IOCs that could tip off a defender to its presence. These are discussed later in this post, along with the additional work required to make this a viable technique.
Architecture Overview
The BAD BASS system is composed of three distinct parts:
1) LIVEBAIT: Simple C++ based shellcode loader that targets browsers
2) WEBPHISH: Golang DLL
3) PHISHROD: Web-Injection configuration resource packer
Essentially, WEBPHISH is compiled down to shellcode (using Donut) which is then included into LIVEBAIT as a PE resource. PHISHROD then creates and packs (as a PE resource) an encrypted archive consisting of web-injects and a mapper file that maps targeted website titles to their specified web-inject.
When LIVEBAIT is run, it will continually discover browser processes that have yet to be injected with WEBPHISH. Once one is found, WEBPHISH will be injected and a remote execution thread will be created.
WEBPHISH will receive the web-inject configuration from LIVEBAIT through inter-process communication (IPC) channels, decrypting and unpacking it before watching the browser window’s title for one that has been targeted. Once a targeted title is discovered, a frameless window will be created that will host a spawned embedded webview. The webview will load the specific embedded web-content (HTML/CSS/JS) it received from LIVEBAIT that maps to the web-page title.
After collecting data from the user, the web-inject’s embedded JavaScript will make a call to transfer the captured content back to WEBPHISH. In turn, WEBPHISH will communicate the captured data to LIVEBAIT over IPC. In this proof of concept, LIVEBAIT simply prints the data to the console and saves a file containing the content to the user's desktop.
Internals Analysis
LIVEBAIT
Compiled as an EXE, LIVEBAIT can be run as a standalone executable or, more likely, injected/mapped into a remote process to hide its presence.
It serves three purposes:
1) Watcher process for targeted browsers
2) Wraps and injects the shellcode-compiled WEBPHISH payload
3) Multi-channel IPC protocol for configuration delivery and collection capture to hinder analysis
This separation of duties allows the loader to be modified without touching the code of the payload or requiring recompilation of the latter. Stealthier injection techniques or delivery of the captured data to an operator can be added in an update.
Inter-Process Communication: Events, Named Pipes, Mailslots, Oh my!
Using multiple channels of IPC, including Events, Named Pipes, and Mailslots increases the complexity of analysis. An analyst that wishes to gain a holistic understanding through dynamic analysis of the malware might need to run multiple debuggers, paying careful attention to multiple threads and processes, or else create custom IPC tools to act in the place of either LIVEBAIT or WEBPHISH to analyze the other.
There are two forms of IPC-channel based communications that take place between LIVEBAIT and WEBPHISH:
- Web-Inject configuration retrieval
  - Mailslots
- Information Collection (phished information from the web-inject form)
  - Events and Named Pipes
WEBPHISH
The Golang-based WEBPHISH payload is first compiled as a DLL. It is then converted into shellcode (using Donut) to be injected and run in the remote browser process through a technique known as shellcode reflective DLL injection. Without getting too deep in the weeds, this technique allows us to inject an arbitrary DLL into a remote process without it being dropped to disk, allowing this portion of the malware to be memory-resident only, even if LIVEBAIT was launched as a standalone EXE from disk (which would not be recommended for opsec).
Upon starting, it will detect the type of browser it is living in (although only Chrome is supported for this POC) before initializing communications with LIVEBAIT to pull the web-inject configuration. Using mailslots, WEBPHISH will retrieve and decrypt from LIVEBAIT the RC4-encrypted ZIP archive containing the web-inject configuration.
Frameless Windows and Chrome Process Tree
Following that, it will attempt to target specific websites, spawning a frameless phishing window from within the browser process before embedding a webview into it. To understand this, we should first take a look at some interesting aspects of windows and Chromium-based browsers.
Despite having only one tab open, Chromium-based browsers spawn several child processes. That number increases with the number of tabs open. When Chrome starts, it actually creates 3 different classes of windows:
- Chrome_WidgetWin_0
- Chrome_WidgetWin_1
- Chrome_RenderWidgetHWND
Different classes of windows perform different behaviors, but in this case we’re only interested in the last two.
Multiple instances of Chrome_WidgetWin_1 can exist, one for each framed (green rectangle) Chrome window. Each Chrome_WidgetWin_1 has a window title corresponding to the current webpage’s title. Each also contains a child Chrome_RenderWidgetHWND that appears to render and host all the web-content, as shown below.
At any given time, the Chrome_WidgetWin_1 window has only one Chrome_RenderWidgetHWND: the active tab. All inactive tabs are moved to become children of a Chrome_WidgetWin_0, which appears to be a container of sorts. If an inactive tab is selected by the user, it switches parents with the active child of the Chrome_WidgetWin_1.
For the purpose of WEBPHISH, it is then clear that we need to find the Chrome_WidgetWin_1 windows and monitor their titles. If the title is one that corresponds to a targeted webpage as specified in the web-inject configuration bundle, we need to create a window covering the active Chrome_RenderWidgetHWND with our phishing content. Our new child window then needs to follow the Chrome_RenderWidgetHWND, resizing with it, as well as minimizing/maximizing/closing with it as needed. A webview will then be created that will be embedded within this new window.
The last step is to hide the fact that we are actually embedding a browser within a browser. As a result, we don’t want the buttons to minimize/maximize/close, or the various browser tools and favorites bar to appear in our injected webview. We do this through setting various options while creating the window to make it frameless.
There are some considerable downsides to this approach of layering a new window on top of the existing Chrome_RenderWidgetHWND. Namely, there is some lag if the browser window is moved rapidly, exposing the real content beneath, and sometimes when closing the window the frameless phishing window will still appear for several seconds thereafter. Some of these downsides might be mitigated; for example, by actually hiding the render window so as to avoid its exposure while the phishing window is running and the browser is moved rapidly. Clearly, this approach isn’t perfect.
Resource Filter Callbacks
To use webviews in this experiment, I relied on an open-source library called [go-webview2](https://github.com/jchv/go-webview2) that handled the heavy lifting. I want to make clear my appreciation to the author, as wrapping the webview interfaces would have been a larger undertaking than I was prepared to take on. The library ultimately allowed me to develop this POC without delving too much into the complexity of setting up and handling the webview lifecycle.
However, the repository did not have a good source of examples, and I found that I could not get several key features working from the library, namely web resource callbacks and retrieving the data posted from the webview.
To that end, I had to make some changes to fix the problem, and the quick-and-dirty solution involved me including a portion of the library's code in the **Webphish/internal/browser/browser.go** and **Webphish/internal/edge/*.go** files, making adjustments as necessary.
The WebView interfaces (which the go-webview2 library essentially wraps) provided by Microsoft do not provide an easy way to load content from memory, except as a base64-encoded string. You are given two options:
- Load from a URL (which could include a file:// prefix to get files from disk)
- Load from a base64-encoded string
The first option would not work as we do not want to drop all of our web-injects to disk. That would be a dead give-away of what the malware is doing. The second option looked promising, but had trouble when trying to load large amounts of content (as is the case with complex web-pages).
I needed another option. That ended up coming in the form of resource filters. Basically, we can configure the webview such that upon it attempting to load a resource (such as a URL) matching a specific regex and data type or “resource context”, it will attempt to retrieve the content from a callback we set.
With this, we can format a hard-coded URL (http://contoso.com as the ResourcePath variable in the image above) with the path of the matching in-memory web-inject and navigate to it with the webview (WebView->Navigate). Since we set our resource request filter to match “*contoso*”, that URL will get passed to our callback webResourceRequested in the image shown below. That callback returns the web-inject to the embedded webview to be loaded.
Using this method, we are able to load web-injects from memory, keeping them from being inspected on disk.
Handling Collection with Web Messages
The last thing WEBPHISH must do is handle the phished information inputted by the user into the form of the web-inject. We can do this through web messages. This was another feature that I had to do some work on to get to work, as I couldn’t seem to get it working through the library (maybe a bug?). In the image above showing the webResourceRequested function, there is another callback function named msgcb. That function calls our own callback method that initiates collection of the captured data through the Named Pipe communication method described in an earlier section of this post.
To send data from the embedded webview back to WEBPHISH, JavaScript in the web-inject need only call a method to post data back, as shown below.
Finally, after sending the data to WEBPHISH, the phishing webview and its underlying frameless window will be closed, allowing the victim to interact with the webpage normally.
PHISHROD
PHISHROD acts as a configurator for LIVEBAIT. Web-injects are supplied in a typical file-tree structure, with one mapper file named config.xml always existing at the root.
The mapper file uses regex to match a given window title to the HTML file it should have the embedded webview load.
WEBPHISH supports the loading of different embedded assets such as images and style-sheet (css) files. However, to make creating a phishing page dead-simple (and because I suck at all things front-end), I used the SingleFile Chrome extension as seen above. This allowed me to copy the Chase.com home page as a single HTML file without cloning all of the site’s spidered assets, embedding required images as base64. Slight modifications to the page were added to create the malicious form seen in the WEBPHISH section of this post.
Once the phishing pages for the targeted websites are complete, there are two steps that PHISHROD needs to perform in order to get the final payload working.
1) Package the web-injects and mapper file into an RC4-encrypted ZIP file
2) Embed that file into the LIVEBAIT executable as a PE resource
The Portable Executable (PE) file format on Windows is a complex data structure, but to understand how this works you can think of an EXE or DLL file consisting of three parts:
- Metadata
- Executable code and variable storage
- Resources
As an example, a PE file for a video game might include resources such as its icon and character assets. In this case, we’re using the resource section of the LIVEBAIT PE file to store (1) WEBPHISH and (2) the RC4-encrypted ZIP file.
Once the embedding of the configuration is complete, the LIVEBAIT payload is ready for execution.
Weaknesses and Indicators of Compromise (IOCs)
There are numerous weaknesses and indicators of compromise that would render the BAD BASS system in its current state to be easily detected. I’ll detail several of those identified from most to least obvious, as well as further measures that could be added to increase stealth.
Process Tree: WebView Processes Under Victim Process
Even a novice analyst can see that something is amiss in the image shown above. In no normal scenario would an msedgewebview2.exe process (belonging to an entirely different browser engine's runtime) be launched under Chrome.
This indicator of compromise could be mitigated by applying parent process ID (PPID) spoofing to the root webview process, separating the webview tree from the Chrome process.
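From the defender's side, the parent-child anomaly described above can be expressed as a simple ancestry check. The following is a minimal sketch rather than a production detection: the `Proc` record and the `SuspiciousWebViews` function are hypothetical names introduced here, the process list is hard-coded instead of being read from an OS snapshot or EDR telemetry, and the WebView2 runtime process name is assumed to be `msedgewebview2.exe`.

```go
package main

import (
	"fmt"
	"strings"
)

// Proc is a hypothetical, simplified process record. A real tool would
// populate it from an OS process snapshot or EDR telemetry.
type Proc struct {
	PID, PPID int
	Name      string
}

// SuspiciousWebViews returns every WebView2 process that has chrome.exe
// somewhere in its reported parent chain.
func SuspiciousWebViews(procs []Proc) []Proc {
	byPID := make(map[int]Proc, len(procs))
	for _, p := range procs {
		byPID[p.PID] = p
	}

	var hits []Proc
	for _, p := range procs {
		if !strings.EqualFold(p.Name, "msedgewebview2.exe") {
			continue
		}
		// Walk up the parent chain; the hop limit guards against PID cycles.
		cur := p
		for hops := 0; hops < 32; hops++ {
			parent, ok := byPID[cur.PPID]
			if !ok {
				break
			}
			if strings.EqualFold(parent.Name, "chrome.exe") {
				hits = append(hits, p)
				break
			}
			cur = parent
		}
	}
	return hits
}

func main() {
	// Illustrative data only: two webview processes under Chrome, one unrelated.
	procs := []Proc{
		{PID: 100, PPID: 1, Name: "chrome.exe"},
		{PID: 200, PPID: 100, Name: "msedgewebview2.exe"},
		{PID: 201, PPID: 200, Name: "msedgewebview2.exe"},
		{PID: 300, PPID: 1, Name: "msedgewebview2.exe"},
	}
	for _, p := range SuspiciousWebViews(procs) {
		fmt.Printf("suspicious: pid=%d ppid=%d name=%s\n", p.PID, p.PPID, p.Name)
	}
}
```

Because this check trusts the reported parent PID, it would miss a tree that has been re-parented, which is why it is best treated as one signal among several.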
High Entropy In LIVEBAIT Resources
The high entropy of the embedded resources is due to compression and encryption. The Donut shellcode generator is configured by default to encrypt the PE/script it wraps. The web-inject configuration is compressed as a ZIP file and then RC4-encrypted (in the case of this POC, with a hard-coded key, which would be considered another weakness).
In both cases, extra work could be used to encode the resources using a low-entropy source, such as the text from an English-language novel or an image (steganography).
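To make the entropy indicator concrete, here is a minimal sketch of the Shannon entropy measurement that a scanner might apply to a resource blob. The function name `ShannonEntropy` and the sample inputs are illustrative assumptions; a real scanner would first extract the resource bytes from the PE and choose its own alerting threshold.

```go
package main

import (
	"fmt"
	"math"
)

// ShannonEntropy returns the entropy of data in bits per byte (0 to 8).
// Compressed or encrypted data tends toward 8; plain text is much lower.
func ShannonEntropy(data []byte) float64 {
	if len(data) == 0 {
		return 0
	}
	var counts [256]int
	for _, b := range data {
		counts[b]++
	}
	n := float64(len(data))
	h := 0.0
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

func main() {
	text := []byte("the quick brown fox jumps over the lazy dog")

	// A buffer containing every byte value once stands in for
	// well-mixed (encrypted or compressed) content.
	uniform := make([]byte, 256)
	for i := range uniform {
		uniform[i] = byte(i)
	}

	fmt.Printf("text:    %.2f bits/byte\n", ShannonEntropy(text))
	fmt.Printf("uniform: %.2f bits/byte\n", ShannonEntropy(uniform))
}
```

Encoding a payload into English text or an image lowers this number, so entropy alone is a coarse filter rather than a verdict.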
Executable Memory Section Not Mapped to a File on Disk
In the case of this POC, the memory hosting the shellcode in the process is mapped into a section of memory with read-write-execute permissions (RWX). This is highly suspicious and does not occur often in the wild (perhaps with the exception of some JIT-compiled runtimes such as Java or .NET). The memory could instead be allocated with read-write (RW) permissions, and then modified to read-execute (RX). However, this would still be highly suspicious to advanced memory scanners such as EDRs, as they will be able to see that the RX memory is not mapped to a file on disk. You can see with the other RX sections that they are mapped to the EXE or DLLs.
This could be mitigated using the sleeping beacon or Gargoyle technique, whereby the payload is triggered, performs some action, and then the section of memory in which it resides is marked RW. A timer is created just before the section encryption that triggers a return-oriented programming (ROP) stub (basically code reuse of the host program) that, once set, will return the payload to RX and execute. This method is not perfect, and tools like [pe-sieve can detect some implementations](https://github.com/hasherezade/pe-sieve/wiki/4.9.-Scan-threads-callstack-(threads)) by inspecting the stacks of individual threads, looking for suspicious function return addresses that do not correspond to regions of memory mapped to actual files on disk. Thread-stack spoofing would therefore be required to decrease the risk of detection by memory-scanning techniques.
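The scanner logic implied above can be sketched as a small classification rule over memory regions. This is an illustrative assumption of how such a heuristic might look, not how any particular EDR works: the `Region` type, its fields, and the `Classify` function are hypothetical, and a real scanner would fill the records from the operating system's memory query APIs.

```go
package main

import "fmt"

// Region is a hypothetical, simplified description of one committed
// memory region in a process.
type Region struct {
	Base       uint64
	Size       uint64
	Executable bool
	Writable   bool
	MappedFile string // empty when the region is not backed by a file on disk
}

// Classify returns a short finding for regions worth a closer look,
// or an empty string for regions that look ordinary.
func Classify(r Region) string {
	switch {
	case !r.Executable:
		return ""
	case r.MappedFile == "" && r.Writable:
		return "unbacked RWX"
	case r.MappedFile == "":
		return "unbacked RX"
	case r.Writable:
		return "file-backed RWX"
	default:
		return "" // executable, read-only, and backed by a file on disk
	}
}

func main() {
	// Illustrative values only.
	regions := []Region{
		{Base: 0x7ff600000000, Size: 0x1000, Executable: true, MappedFile: `C:\app\chrome.dll`},
		{Base: 0x1f000000000, Size: 0x40000, Executable: true, Writable: true},
		{Base: 0x1f000100000, Size: 0x40000, Executable: true},
		{Base: 0x1f000200000, Size: 0x2000, Writable: true},
	}
	for _, r := range regions {
		if finding := Classify(r); finding != "" {
			fmt.Printf("0x%x (+0x%x): %s\n", r.Base, r.Size, finding)
		}
	}
}
```

JIT-compiled runtimes legitimately produce unbacked executable memory, so findings like these usually feed further inspection rather than trigger an alert on their own.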
Susceptible to User-Mode Hooks in LIVEBAIT Injector
Finally, the shellcode injection technique used in this POC is trivial, using highly suspicious calls such as CreateRemoteThread that may cause a security product to investigate the calling process more deeply. EDRs and other Anti-Virus products will usually hook Windows APIs for every user-mode process on those functions that are most likely to be abused (such as CreateRemoteThread). Using the functions exposed by Kernel32 or ntdll through any normal means will pass execution through the security product.
This can be avoided by either unhooking those functions using a library like HalosUnhooker, refreshing the DLL in memory, or by using a variation of techniques to get the syscall numbers that those kernel32/ntdll functions ultimately wrap. As a result of PatchGuard, introduced with 64-bit Windows, security products can’t hook the kernel syscalls anymore and must rely on other means such as user-mode hooking. We can then call the native functions or use the syscalls retrieved through a technique known as direct syscalls to bypass some of the telemetry they monitor (other forms of telemetry still exist such as ETWTI that can be blinded with memory patching).
Next Steps
BAD BASS was just an experiment to see the viability of injecting webviews for the purpose of phishing. It does work, but I don’t have confidence that it’s much better than existing techniques. The only thing that I believe this technique could be useful for is in defeating MFA with the inclusion of Evilginx2. A dynamic web-inject resolver would be needed in WEBPHISH, perhaps by extending the configuration file format to point at a remote resource (and then extending WEBPHISH to retrieve it) which would not be overly difficult. This would not be impossible with the other techniques discussed such as API hooking and browser extensions, but would require significantly more work.
Credential theft can be accomplished better with other means that do not require process injection, with the most promising technique I’ve seen being malicious browser extensions. With that technique, you will only need to bypass file scanning. Once in memory, EDRs are blind to what the browser extension is doing as there are no weird memory allocations, and it becomes the responsibility of browser developers to ensure that extensions are not performing malicious activity. As of now, this seems to be a more vulnerable point until Google hardens it.