Martina Lindorfer

Associate Prof. Dipl.-Ing.in Dr.in techn. / BSc

Martina Lindorfer

I joined the Security and Privacy Research group in 2018 as an Assistant Professor. My research interests are applied security, with a special focus on mobile security and privacy.

While this page is work in progress, please refer to my personal website for up to date news, publications, and research artifacts: https://martina.lindorfer.in/

I am always looking for motivated students, if you are interested in working with me have a look at our open positions and thesis opportunities!

Roles
  • Associate Professor
Projects (at TU Wien)
Publications (created while at TU Wien)
    2023
    • Back-to-the-Future Whois: An IP Address Attribution Service for Working with Historic Datasets
      Streibelt, F., Lindorfer, M., Gürses, S., Hernández Gañán, C., & Fiebig, T. (2023). Back-to-the-Future Whois: An IP Address Attribution Service for Working with Historic Datasets. In Passive and Active Measurement : 24th International Conference, PAM 2023, Virtual Event, March 21–23, 2023, Proceedings (pp. 209–226). Springer.
      DOI: 10.1007/978-3-031-28486-1_10 Metadata
      Abstract
      Researchers and practitioners often face the issue of having to attribute an IP address to an organization. For current data this is comparably easy, using services like whois or other databases. Similarly, for historic data, several entities like the RIPE NCC provide websites that provide access to historic records. For large-scale network measurement work, though, researchers often have to attribute millions of addresses. For current data, Team Cymru provides a bulk whois service which allows bulk address attribution. However, at the time of writing, there is no service available that allows historic bulk attribution of IP addresses. Hence, in this paper, we introduce and evaluate our ‘Back-to-the-Future whois’ service, allowing historic bulk attribution of IP addresses on a daily granularity based on CAIDA Routeviews aggregates. We provide this service to the community for free, and also share our implementation so researchers can run instances themselves.
    • Not Your Average App: A Large-scale Privacy Analysis of Android Browsers
      Pradeep, A., Feal, Á., Gamba, J., Rao, A., Lindorfer, M., Vallina-Rodriguez, N., & Choffnes, D. (2023). Not Your Average App: A Large-scale Privacy Analysis of Android Browsers. In M. L. Mazurek & M. Sherr (Eds.), Proceedings on Privacy Enhancing Technologies Symposium 2023 (pp. 29–46).
      DOI: 10.56553/popets-2023-0003 Metadata
      Abstract
      The privacy-related behavior of mobile browsers has remained widely unexplored by the research community. In fact, as opposed to regular Android apps, mobile browsers may present contradicting privacy behaviors. On the one hand, they can have access to (and can expose) a unique combination of sensitive user data, from users’ browsing history to permission-protected personally identifiable information (PII) such as unique identifiers and geolocation. On the other hand, they are in a unique position to protect users’ privacy by limiting data sharing with other parties by implementing ad- blocking features. In this paper, we perform a comparative and empirical analysis on how hundreds of Android web browsers protect or expose user data during browsing sessions. To this end, we collect the largest dataset of Android browsers to date, from the Google Play Store and four Chinese app stores. Then, we develop a novel analysis pipeline that combines static and dynamic analysis methods to find a wide range of privacy-enhancing (e.g., ad-blocking) and privacy-harming behaviors (e.g., sending browsing histories to third parties, not validating TLS certificates, and exposing PII—including non-resettable identifiers—to third parties) across browsers. We find that various popular apps on both Google Play and Chinese stores have these privacy-harming behaviors, including apps that claim to be privacy-enhancing in their descriptions. Overall, our study not only provides new insights into important yet overlooked considerations for browsers’ adoption and transparency, but also that automatic app analysis systems (e.g., sandboxes) need context-specific analysis to reveal such privacy behaviors.
    • Of Ahead Time: Evaluating Disassembly of Android Apps Compiled to Binary OATs Through the ART
      Bleier, J., & Lindorfer, M. (2023). Of Ahead Time: Evaluating Disassembly of Android Apps Compiled to Binary OATs Through the ART. In J. Polakis & E. van der Kouwe (Eds.), EUROSEC ’23: Proceedings of the 16th European Workshop on System Security (pp. 21–29).
      DOI: 10.1145/3578357.3591219 Metadata
      Abstract
      The Android operating system has evolved significantly since its initial release in 2008. Most importantly, in a continuing effort to increase the run-time performance of mobile applications (apps) and to reduce resource requirements, the way code is executed has transformed from being bytecode-based to a binary-based approach: Apps are still mainly distributed as Dalvik bytecode, but the Android Runtime (ART) uses an optimizing compiler to create binary code ahead-of-time (AOT), just-in-time (JIT), or as a combination of both. These changes in the build pipeline, including increasing obfuscation and optimization of the Dalvik bytecode, invalidate assumptions of bytecode-based static code analysis approaches through identifier renaming and code shrinking. Furthermore, customized apps can be distributed pre-compiled with devices’ firmware, sidestepping the bytecode altogether. Finally, Android apps have always relied on native binary code libraries for performance-critical tasks. We propose to narrow the gap between bytecode and binary code by leveraging the ART compiler’s capability to create well-formed ELF binaries, called OATs, as the basis for further static code analysis. To this end, we created a pipeline to automatically and efficiently compile APKs to OATs into a benchmark dataset of 1,339 apps. We then evaluate five popular disassemblers based on how well they can analyze these OATs based on how well they can detect function boundaries. Our results, in particular, compared to the success rate of two bytecode-based analyzers, demonstrate that our OAT-based approach can help to bring a wider set of code analysis tools and techniques to the area of Android app analysis.
    • Mixed Signals: Analyzing Software Attribution Challenges in the Android Ecosystem
      Hageman, K., Feal, A., Gamba, J., Girish, A., Bleier, J., Lindorfer, M., Tapiador, J., & Vallina-Rodriguez, N. (2023). Mixed Signals: Analyzing Software Attribution Challenges in the Android Ecosystem. IEEE Transactions on Software Engineering, 49(4), 2964–2979.
      DOI: 10.34726/5296 Metadata
      Abstract
      The ability to identify the author responsible for a given software object is critical for many research studies and for enhancing software transparency and accountability. However, as opposed to other application markets like Apple's iOS App Store, attribution in the Android ecosystem is known to be hard. Prior research has leveraged market metadata and signing certificates to identify software authors without questioning the validity and accuracy of these attribution signals. However, Android application (app) authors can, either intentionally or by mistake, hide their true identity due to: (1) the lack of policy enforcement by markets to ensure the accuracy and correctness of the information disclosed by developers in their market profiles during the app release process, and (2) the use of self-signed certificates for signing apps instead of certificates issued by trusted CAs. In this paper, we perform the first empirical analysis of the availability, volatility and overall aptness of publicly available market and app metadata for author attribution in Android markets. To that end, we analyze a dataset of over 2.5 million market entries and apps extracted from five Android markets for over two years. Our results show that widely used attribution signals are often missing from market profiles and that they change over time. We also invalidate the general belief about the validity of signing certificates for author attribution. For instance, we find that apps from different authors share signing certificates due to the proliferation of app building frameworks and software factories. Finally, we introduce the concept of an attribution graph and we apply it to evaluate the validity of existing attribution signals on the Google Play Store. Our results confirm that the lack of control over publicly available signals can confuse automatic attribution processes.
    • IoTFlow: Inferring IoT Device Behavior at Scale through Static Mobile Companion App Analysis
      Schmidt, D., Tagliaro, C., Borgolte, K., & Lindorfer, M. (2023). IoTFlow: Inferring IoT Device Behavior at Scale through Static Mobile Companion App Analysis. In CCS ’23: Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (pp. 681–695). Association for Computing Machinery.
      DOI: 10.1145/3576915.3623211 Metadata
      Abstract
      The number of “smart” devices, that is, devices making up the Internet of Things (IoT), is steadily growing. They suffer from vulnerabilities just as other software and hardware. Automated analysis techniques can detect and address weaknesses before attackers can misuse them. Applying existing techniques or developing new approaches that are sufficiently general is challenging though. Contrary to other platforms, the IoT ecosystem features various software and hardware architectures. We introduce IoTFlow, a new static analysis approach for IoT devices that leverages their mobile companion apps to address the diversity and scalability challenges. IoTFlow combines Value Set Analysis (VSA) with more general data-flow analysis to automatically reconstruct and derive how companion apps communicate with IoT devices and remote cloud-based backends, what data they receive or send, and with whom they share it. To foster future work and reproducibility, our IoTFlow implementation is open source. We analyze 9,889 manually verified companion apps with IoTFlow to understand and characterize the current state of security and privacy in the IoT ecosystem, which also demonstrates the utility of IoTFlow. We compare how these IoT apps differ from 947 popular general-purpose apps in their local network commu- nication, the protocols they use, and who they communicate with. Moreover, we investigate how the results of IoTFlow compare to dynamic analysis, with manual and automated interaction, of 13 IoT devices when paired and used with their companion apps. Overall, utilizing IoTFlow, we discover various IoT security and privacy issues, such as abandoned domains, hard-coded credentials, expired certificates, and sensitive personal information being shared.
    • Heads in the Clouds? Measuring Universities’ Migration to Public Clouds: Implications for Privacy & Academic Freedom
      Fiebig, T., Gürses, S., Hernández Gañán, C., Kotkamp, E., Kuipers, F., Lindorfer, M., Prisse, M., & Sari, T. (2023). Heads in the Clouds? Measuring Universities’ Migration to Public Clouds: Implications for Privacy & Academic Freedom. In M. L. Mazurek & M. Sherr (Eds.), Proceedings on Privacy Enhancing Technologies (pp. 117–150). De Gruyter Open / Sciendo.
      DOI: 10.56553/popets-2023-0044 Metadata
      Abstract
      With the emergence of remote education and work in universities due to COVID-19, the 'zoomification' of higher education, i.e., the migration of universities to the clouds, reached the public discourse. Ongoing discussions reason about how this shift will take control over students' data away from universities, and may ultimately harm the privacy of researchers and students alike. However, there has been no comprehensive measurement of universities' use of public clouds and reliance on Software-as-a-Service offerings to assess how far this migration has already progressed. We perform a longitudinal study of the migration to public clouds among universities in the U.S. and Europe, as well as institutions listed in the Times Higher Education (THE) Top100 between January 2015 and October 2022. We find that cloud adoption differs between countries, with one cluster (Germany, France, Austria, Switzerland) showing a limited move to clouds, while the other (U.S., U.K., the Netherlands, THE Top100) frequently outsources universities' core functions and services---starting long before the COVID-19 pandemic. We attribute this clustering to several socio-economic factors in the respective countries, including the general culture of higher education and the administrative paradigm taken towards running universities. We then analyze and interpret our results, finding that the implications reach beyond individuals' privacy towards questions of academic independence and integrity.
    • Connecting the .dotfiles: Checked-In Secret Exposure with Extra (Lateral Movement) Steps
      Jungwirth, G., Saha, A., Schröder, M., Fiebig, T., Lindorfer, M., & Cito, J. (2023). Connecting the .dotfiles: Checked-In Secret Exposure with Extra (Lateral Movement) Steps. In IEEE/ACM 20th International Conference on Mining Software Repositories (MSR) (pp. 322–333).
      DOI: 10.1109/MSR59073.2023.00051 Metadata
      Abstract
      Personal software configurations, known as dotfiles, are increasingly being shared in public repositories. To understand the security and privacy implications of this phenomenon, we conducted a large-scale analysis of dotfiles repositories on GitHub. Furthermore, we surveyed repository owners to understand their motivations for sharing dotfiles, and their awareness of the security implications. Our mixed-method approach consisted of two parts: (1) We mined 124,230 public dotfiles repositories and inductively searched them for security and privacy flaws. (2) We then conducted a survey of repository owners (n=1,650) to disclose our findings and learn more about the problems and implications. We found that 73.6 % of repositories leak potentially sensitive information, most commonly email addresses (of which we found 1.2 million), but also RSA private keys, API keys, installed software versions, browsing history, and even mail client inboxes. In addition, we found that sharing is mainly ideological (an end in itself) and to show off ("ricing"), in addition to easing machine setup. Most users are confident about the contents of their files and claim to understand the security implications. In response to our disclosures, a small minority (2.2%) will make their repositories private or delete them, but the majority of respondents will continue sharing their dotfiles after taking appropriate actions. Dotfiles repositories are a great tool for developers to share knowledge and communicate - if done correctly. We provide recommendations for users and platforms to make them more secure. Specifically, tools should be used to manage dotfiles. In addition, platforms should work on more sophisticated tests, to find weaknesses automatically and inform the users or control the damage.
    • I Still Know What You Watched Last Sunday: Privacy of the HbbTV Protocol in the European Smart TV Landscape
      Tagliaro, C., Hahn, F., Sepe, R., Aceti, A., & Lindorfer, M. (2023). I Still Know What You Watched Last Sunday: Privacy of the HbbTV Protocol in the European Smart TV Landscape. In Proceedings Network and Distributed System Security (NDSS) Symposium 2023. 30th Annual Network and Distributed System Security Symposium (NDSS) 2023, San Diego, United States of America (the).
      DOI: 10.14722/ndss.2023.24102 Metadata
      Abstract
      The ever-increasing popularity of Smart TVs and support for the Hybrid Broadcast Broadband TV (HbbTV) standard allow broadcasters to enrich content offered to users via the standard broadcast signal with Internet-delivered apps, e.g., ranging from quizzes during a TV show to targeted advertisement. HbbTV works using standard web technologies as transparent overlays over a TV channel. Despite the number of HbbTV-enabled devices rapidly growing, studies on the protocol’s security and privacy aspects are scarce, and no standard protective measure is in place. We fill this gap by investigating the current state of HbbTV in the European landscape and assessing its implications for users’ privacy. We shift the focus from the Smart TV’s firmware and app security, already studied in-depth in related work, to the content transmission protocol itself. Contrary to traditional “linear TV” signals, HbbTV allows for bi-directional communication: in addition to receiving TV content, it also allows for transmitting data back to the broadcaster. We describe techniques broadcasters use to measure users’ (viewing) preferences and show how the protocol’s implementation can cause severe privacy risks by studying its deployment by 36 TV channels in five European countries (Italy, Germany, France, Austria, and Finland). We also survey users’ awareness of Smart TV and HbbTV-related risks. Our results show little understanding of the possible threats users are exposed to. Finally, we present a denylist-based mechanism to ensure a safe experience for users when watching TV and to reduce the privacy issues that HbbTV may pose.
    2022
    • Position Paper: Escaping Academic Cloudification to Preserve Academic Freedom
      Fiebig, T., Gürses, S., & Lindorfer, M. (2022). Position Paper: Escaping Academic Cloudification to Preserve Academic Freedom. Privacy Studies Journal, 51–68.
      DOI: 10.7146/psj.vi.132713 Metadata
      Abstract
      Especially since the onset of the COVID-19 pandemic, the use of cloud-based tools and solutions - lead by the ‘Zoomification’ of education, has picked up attention in the EdTech and privacy communities. In this paper, we take a look at the progressing use of cloud-based educational tools, often controlled by only a handful of major corporations. We analyse how this ‘cloudification’ impacts academics’ and students’ privacy and how it influences the handling of privacy by universities and higher education institutions. Furthermore, we take a critical perspective on how this cloudification may not only threaten users’ privacy, but ultimately may also compromise core values like academic freedom: the dependency relationships between universities and corporations could impact curricula, while also threatening what research can be conducted. Finally, we take a perspective on universities’ cloudification in different western regions to identify policy mechanisms and recommendations that can enable universities to preserve their academic independence, without compromising on digitalization and functionality.
    • A Comparative Analysis of Certificate Pinning in Android & iOS
      Pradeep, A., Paracha, M. T., Bhowmick, P., Davanian, A., Razaghpanah, A., Chung, T., Lindorfer, M., Vallina-Rodriguez, N., Levin, D., & Choffnes, D. (2022). A Comparative Analysis of Certificate Pinning in Android & iOS. In Proceedings of the 22nd ACM Internet Measurement Conference (pp. 605–618). ACM.
      DOI: 10.34726/3505 Metadata
      Abstract
      TLS certificate pinning is a security mechanism used by applications (apps) to protect their network traffic against malicious certificate authorities (CAs), in-path monitoring, and other methods of TLS tampering. Pinning can provide enhanced security to defend against malicious third-party access to sensitive data in transit (e.g., to protect sensitive banking and health care information), but can also hide an app’s personal data collection from users and auditors. Prior studies found pinning was rarely used in the Android ecosystem, except in high-profile, security-sensitive apps; and, little is known about its usage on iOS and across mobile platforms. In this paper, we thoroughly investigate the use of certificate pinning on Android and iOS. We collect 5,079 unique apps from the two official app stores: 575 common apps, 1,000 popular apps each, and 1,000 randomly selected apps each. We develop novel, cross-platform, static and dynamic analysis techniques to detect the usage of certificate pinning. Thus, our study offers a more comprehensive understanding of certificate pinning than previous studies. We find certificate pinning as much as 4 times more widely adopted than reported in recent studies. More specifically, we find that 0.9% to 8% of Android apps and 2.5% to 11% of iOS apps use certificate pinning at run time (depending on the aforementioned sets of apps). We then investigate which categories of apps most frequently use pinning (e.g., apps in the “finance” category), which destinations are typically pinned (e.g., first-party destinations vs those used by third-party libraries), which certificates are pinned and how these are pinned (e.g., CA vs leaf certificates), and the connection security for pinned connections vs unpinned ones (e.g., the use of weak ciphers or improper certificate validation). Lastly, we investigate how many pinned connections are amenable to binary instrumentation to reveal the contents of their connections; for those that are, we analyze the data sent over pinned connections to understand what is protected by pinning.
    • Comparing User Perceptions of Anti-Stalkerware Apps with the Technical Reality
      Fassl, M., Anell, S., Houy, S., Lindorfer, M., & Krombholz, K. (2022). Comparing User Perceptions of Anti-Stalkerware Apps with the Technical Reality. In Proceedings of the Eighteenth Symposium on Usable Privacy and Security (SOUPS 2022) (pp. 135–154). USENIX Association.
      DOI: 10.34726/3902 Metadata
      Abstract
      Every year an increasing number of users face stalkerware on their phones. Many of them are victims of intimate partner surveillance (IPS) who are unsure how to identify or remove stalkerware from their phones. An intuitive approach would be to choose anti-stalkerware from the app store. However, a mismatch between user expectations and the technical capabilities can produce an illusion of security and risk compensation behavior (i.e., the Peltzmann effect). We compare users’ perceptions of anti-stalkerware with the technical reality. First, we applied thematic analysis to app reviews to analyze user perceptions. Then, we performed a cognitive walkthrough of two prominent anti-stalkerware apps available on the Google Play Store and reverse-engineered them to understand their detection features. Our results suggest that users base their trust on the look and feel of the app, the number and type of alerts, and the apps’ affordances. We also found that app capabilities do not correspond to the users’ perceptions and expectations, impacting their practical effectiveness. We discuss different stakeholders’ options to remedy these challenges and better align user perceptions with the technical reality.
    • Not that Simple: Email Delivery in the 21st Century
      Holzbauer, F., Ullrich, J., Lindorfer, M., & Fiebig, T. (2022). Not that Simple: Email Delivery in the 21st Century. In Proceedings of the 2022 USENIX Annual Technical Conference (pp. 295–308). USENIX Association.
      DOI: 10.34726/4024 Metadata
      Abstract
      Over the past two decades, the number of RFCs related to email and its security has exploded from below 100 to nearly 500. This embedded the Simple Mail Transfer Protocol (SMTP) into a tree of interdependent and delivery-relevant standards. In this paper, we investigate how far real-world deployments keep up with this increasing complexity of delivery- and security options. To gain an in-depth picture of email delivery apart from the giants in the ecosystem (Gmail, Outlook, etc.), we engage people to send emails to eleven differently configured target domains. Our measurements allow us to evaluate core aspects of email delivery, including security features, DNS configuration, and IP version support on the sending side across different types of providers. We find that novel technologies are often insufficiently supported, even by large providers. For example, while 65.4% of email providers can resolve hosts via IPv6, only 44.3% can also deliver emails via IPv6. Concerning security features, we observe that less than half (41.5%) of all providers rely on DNSSEC validating resolvers, and encryption is mostly opportunistic, with 89.7% of providers accepting invalid certificates. TLSA, as a DNS-based certificate verification method, is only used by 31.7% of the providers in our study. Finally, we turned our eye to the impact modern standards have on unsolicited bulk email (SPAM). We found that greylisting is effective, reducing the SPAM volume by roughly half while not impacting regular delivery. However, and interestingly, SPAM delivery currently seems to focus on plaintext IPv4 connections, making IPv6-only, TLS-enforcing inbound email servers a more effective anti-SPAM measure—even though it also means rejecting a major portion of legitimate emails.
    • No Spring Chicken: Quantifying the Lifespan of Exploits in IoT Malware Using Static and Dynamic Analysis
      Al Alsadi, A. A., Sameshima, K., Bleier, J., Yoshioka, K., Lindorfer, M., van Eeten, M., & Hernández Gañán, C. (2022). No Spring Chicken: Quantifying the Lifespan of Exploits in IoT Malware Using Static and Dynamic Analysis. In Yuji Suga, Kouichi Sakurai, Xuhua Ding, & Kazue Sako (Eds.), ASIA CCS ’22: Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security (pp. 309–321). Association for Computing Machinery.
      DOI: 10.1145/3488932.3517408 Metadata
      Abstract
      The Internet of things (IoT) is composed by a wide variety of software and hardware components that inherently contain vulnerabilities. Previous research has shown that it takes only a few minutes from the moment an IoT device is connected to the Internet to the first infection attempts. Still, we know little about the evolution of exploit vectors: Which vulnerabilities are being targeted in the wild, how has the functionality changed over time, and for how long are vulnerabilities being targeted? Understanding these questions can help in the secure development, and deployment of IoT networks. We present the first longitudinal study of IoT malware exploits by analyzing 17,720 samples collected from three different sources from 2015 to 2020. Leveraging static and dynamic analysis, we extract exploits from these binaries to then analyze them along the following four dimensions: (1) evolution of infection vectors over the years, (2) exploit lifespan, vulnerability age, and the time-to-exploit of vulnerabilities, (3) functionality of exploits, and (4) targeted IoT devices and manufacturers. Our descriptive analysis uncovers several patterns: IoT malware keeps evolving, shifting from simply leveraging brute force attacks to including dozens of device-specific exploits. Once exploits are developed, they are rarely abandoned. The most recent binaries still target (very) old vulnerabilities. In some cases, new exploits are developed for a vulnerability that has been known for years. We find that the mean time-to-exploit after vulnerability disclosure is around 29 months, much longer than for malware targeting other environments.
    2021
    • Tarnhelm: Isolated, Transparent & Confidential Execution of Arbitrary Code in ARM's TrustZone
      Quarta, D., Ianni, M., Machiry, A., Fratantonio, Y., Gustafson, E., Balzarotti, D., Lindorfer, M., Vigna, G., & Kruegel, C. (2021). Tarnhelm: Isolated, Transparent & Confidential Execution of Arbitrary Code in ARM’s TrustZone. In Proceedings of the 2021 Research on offensive and defensive techniques in the Context of Man At The End (MATE) Attacks. ACM, Austria. ACM.
      DOI: 10.1145/3465413.3488571 Metadata ⯈Fulltext (preprint)
      Abstract
      Protecting the confidentiality of applications on commodity operating systems, both on desktop and mobile devices, is challenging: attackers have unrestricted control over an application´s processes and thus direct access to any of the application´s assets. However, the application´s code itself can be of great commercial value, for example in the case of proprietary code or additional functionality obtained as downloadable content and via in-app purchases, which are widely used to monetize free applications through premium content. Developers still rely heavily on obfuscation to protect their own code from unauthorized tampering or copying, providing an obstacle for an attacker, but not preventing compromise. In this paper, we present Tarnhelm, an approach to offer a practical and transparent primitive to implement code confidentiality by extending ARM´s TrustZone, a TEE that so far provides limited functionality to application developers. Tarnhelm allows develop- ers to easily designate part of their code as confidential through source code annotations. At compile time, Tarnhelm automatically partitions the application into regular application code, executed in the "normal world," and the invisible code, transparently executed in the "secure world." Tarnhelm tightly couples and secures the execution in both worlds without exposing any additional attack surface by combining a number of different techniques, such as secure code loading, system call forwarding, transparent world switching, and the enforcement of inter-world control-flow integrity. We implemented a proof of concept of Tarnhelm and demonstrate its feasibility in a mobile computing setting.
    2020
    • When Malware is Packin' Heat; Limits of Machine Learning Classifiers Based on Static Analysis Features
      Aghakhani, H., Gritti, F., Mecca, F., Lindorfer, M., Ortolani, S., Balzarotti, D., Vigna, G., & Krügel, C. (2020). When Malware is Packin’ Heat; Limits of Machine Learning Classifiers Based on Static Analysis Features. In Network and Distributed System Security Symposium (NDSS). Internet Society.
      Metadata ⯈Fulltext (preprint)
      Abstract
      Machine learning techniques are widely used in addition to signatures and heuristics to increase the detection rate of anti-malware software, as they automate the creation of detection models, making it possible to handle an ever-increasing number of new malware samples. In order to foil the analysis of anti-malware systems and evade detection, malware uses packing and other forms of obfuscation. However, few realize that benign applications use packing and obfuscation as well, to protect intellectual property and prevent license abuse. In this paper, we study how machine learning based on static analysis features operates on packed samples. Malware researchers have often assumed that packing would prevent machine learning techniques from building effective classifiers. However, both industry and academia have published results that show that machine-learning-based classifiers can achieve good detection rates, leading many experts to think that classifiers are simply detecting the fact that a sample is packed, as packing is more prevalent in malicious samples. We show that, different from what is commonly assumed, packers do preserve some information when packing programs that is "useful" for malware classification. However, this information does not necessarily capture the sample´s behavior. We demonstrate that the signals extracted from packed executables are not rich enough for machine-learning-based models to (1) generalize their knowl- edge to operate on unseen packers, and (2) be robust against adversarial examples. We also show that a na ̈ıve application of machine learning techniques results in a substantial number of false positives, which, in turn, might have resulted in incorrect labeling of ground-truth data used in past work.
    • FlowPrint: Semi-Supervised Mobile-App Fingerprinting on Encrypted Network Traffic
      van Ede, T., Bortolameotti, R., Continella, A., Ren, J., Dubois, D., Lindorfer, M., Choffnes, D., van Steen, M., & Peter, A. (2020). FlowPrint: Semi-Supervised Mobile-App Fingerprinting on Encrypted Network Traffic. In Network and Distributed System Security Symposium (NDSS). Internet Society.
      Metadata ⯈Fulltext (preprint)
      Abstract
      Mobile-application fingerprinting of network traffic is valuable for many security solutions as it provides insights into the apps active on a network. Unfortunately, existing techniques require prior knowledge of apps to be able to recognize them. However, mobile environments are constantly evolving, i.e., apps are regularly installed, updated, and uninstalled. Therefore, it is infeasible for existing fingerprinting approaches to cover all apps that may appear on a network. Moreover, most mobile traffic is encrypted, shows similarities with other apps, e.g., due to common libraries or the use of content delivery networks, and depends on user input, further complicating the fingerprinting process. As a solution, we propose FLOWPRINT, a semi-supervised approach for fingerprinting mobile apps from (encrypted) net- work traffic. We automatically find temporal correlations among destination-related features of network traffic and use these correlations to generate app fingerprints. Our approach is able to fingerprint previously unseen apps, something that existing techniques fail to achieve. We evaluate our approach for both Android and iOS in the setting of app recognition, where we achieve an accuracy of 89.2%, significantly outperforming state- of-the-art solutions. In addition, we show that our approach can detect previously unseen apps with a precision of 93.5%, detecting 72.3% of apps within the first five minutes of communication.
    • TXTing 101: Finding Security Issues in the Long Tail of DNS TXT Records
      der Toorn, O. van, van Rijswijk-Deij, R., Fiebig, T., Lindorfer, M., & Sperotto, A. (2020). TXTing 101: Finding Security Issues in the Long Tail of DNS TXT Records. In 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). IEEE.
      DOI: 10.1109/eurospw51379.2020.00080 Metadata ⯈Fulltext (preprint)
      Abstract
      The DNS TXT resource record is the one with the most flexibility for its contents, as it is a largely unstructured. Although it might be the ideal basis for storing any form of text-based information, it also poses a security threat, as TXT records can also be used for malicious and unintended practices. Yet, TXT records are often overlooked in security research. In this paper, we present the first structured study of the uses of TXT records, with a specific focus on security implications. We are able to classify over 99.54% of all TXT records in our dataset, finding security issues including accidentally published private keys and exploit delivery attempts. We also report on our lessons learned during our large-scale, systematic analysis of TXT records.
    2018
    • MineSweeper: An In-depth Look into Drive-by Cryptocurrency Mining and Its Defense
      Konoth, R. K., Vineti, E., Moonsamy, V., Lindorfer, M., Kruegel, C., Bos, H., & Vigna, G. (2018). MineSweeper: An In-depth Look into Drive-by Cryptocurrency Mining and Its Defense. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM.
      DOI: 10.1145/3243734.3243858 Metadata ⯈Fulltext (preprint)
      Abstract
      A wave of alternative coins that can be effectively mined without specialized hardware, and a surge in cryptocurrencies´ market value has led to the development of cryptocurrency mining (cryptomining) services, such as Coinhive, which can be easily integrated into websites to monetize the computational power of their visitors. While legitimate website operators are exploring these services as an alternative to advertisements, they have also drawn the attention of cybercriminals: drive-by mining (also known as cryptojacking) is a new web-based attack, in which an infected website secretly executes JavaScript code and/or a WebAssembly module in the user´s browser to mine cryptocurrencies without her consent. In this paper, we perform a comprehensive analysis on Alexa´s Top 1 Million websites to shed light on the prevalence and profitabil- ity of this attack. We study the websites affected by drive-by mining to understand the techniques being used to evade detection, and the latest web technologies being exploited to efficiently mine cryptocurrency. As a result of our study, which covers 28 Coinhive-like services that are widely being used by drive-by mining websites, we identified 20 active cryptomining campaigns. Motivated by our findings, we investigate possible countermeasures against this type of attack. We discuss how current blacklisting approaches and heuristics based on CPU usage are insufficient, and present MineSweeper, a novel detection technique that is based on the intrinsic characteristics of cryptomining code, and, thus, is resilient to obfuscation. Our approach could be integrated into browsers to warn users about silent cryptomining when visiting websites that do not ask for their consent.
    • Panoptispy: Characterizing Audio and Video Exfiltration from Android Applications
      Pan, E., Ren, J., Lindorfer, M., Wilson, C., & Choffnes, D. (2018). Panoptispy: Characterizing Audio and Video Exfiltration from Android Applications. In Proceedings on Privacy Enhancing Technologies (pp. 33–50). DeGruyter.
      DOI: 10.1515/popets-2018-0030 Metadata ⯈Fulltext (preprint)
      Abstract
      The high-fidelity sensors and ubiquitous internet connectivity offered by mobile devices have facilitated an explosion in mobile apps that rely on multi-media features. However, these sensors can also be used in ways that may violate user´s expectations and personal privacy. For example, apps have been caught taking pictures without the user´s knowledge and passively listened for inaudible, ultrasonic audio beacons. The developers of mobile device operating systems recognize that sensor data is sensitive, but unfortunately existing permission models only mitigate some of the privacy concerns surrounding multimedia data. In this work, we present the first large-scale empirical study of media permissions and leaks from Android apps, covering 17,260 apps from Google Play, AppChina, Mi.com, and Anzhi. We study the behavior of these apps using a combination of static and dynamic analysis techniques. Our study reveals several alarming privacy risks in the Android app ecosystem, including apps that over-provision their media permissions and apps that share image and video data with other parties in unexpected ways, without user knowledge or consent. We also identify a previously unreported privacy risk that arises from third-party libraries that record and upload screenshots and videos of the screen without informing the user and without requiring any permissions.
    • GuardION: Practical Mitigation of DMA-Based Rowhammer Attacks on ARM
      van der Veen, V., Lindorfer, M., Fratantonio, Y., Padmanabha Pillai, H., Vigna, G., Kruegel, C., Bos, H., & Razavi, K. (2018). GuardION: Practical Mitigation of DMA-Based Rowhammer Attacks on ARM. In Detection of Intrusions and Malware, and Vulnerability Assessment (pp. 92–113). Springer.
      DOI: 10.1007/978-3-319-93411-2_5 Metadata ⯈Fulltext (preprint)
      Abstract
      Over the last two years, the Rowhammer bug transformed from a hard-to-exploit DRAM disturbance error into a fully weaponized attack vector. Researchers demonstrated exploits not only against desktop computers, but also used single bit flips to compromise the cloud and mobile devices, all without relying on any software vulnerability. Since hardware-level mitigations cannot be backported, a search for software defenses is pressing. Proposals made by both academia and industry, however, are either impractical to deploy, or insufficient in stopping all attacks: we present rampage, a set of DMA-based Rowhammer attacks against the latest Android OS, consisting of (1) a root exploit, and (2) a series of app-to-app exploit scenarios that bypass all defenses. To mitigate Rowhammer exploitation on ARM, we propose guardion, a lightweight defense that prevents DMA-based attacks-the main attack vector on mobile devices-by isolating DMA buffers with guard rows. We evaluate guardion on 22 benchmark apps and show that it has a negligible memory overhead (2.2 MB on average). We further show that we can improve system performance by re-enabling higher order allocations after Google disabled these as a reaction to previous attacks.
    • Bug Fixes, Improvements, ... and Privacy Leaks - A Longitudinal Study of PII Leaks Across Android App Versions
      Ren, J., Lindorfer, M., Dubois, D. J., Rao, A., Choffnes, D., & Vallina-Rodriguez, N. (2018). Bug Fixes, Improvements, ... and Privacy Leaks - A Longitudinal Study of PII Leaks Across Android App Versions. In Proceedings 2018 Network and Distributed System Security Symposium. Internet Society.
      DOI: 10.14722/ndss.2018.23143 Metadata ⯈Fulltext (preprint)
      Abstract
      Is mobile privacy getting better or worse over time? In this paper, we address this question by studying privacy leaks from historical and current versions of 512 popular Android apps, covering 7,665 app releases over 8 years of app version history. Through automated and scripted interaction with apps and analysis of the network traffic they generate on real mobile devices, we identify how privacy changes over time for individual apps and in aggregate. We find several trends that include increased collection of personally identifiable information (PII) across app versions, slow adoption of HTTPS to secure the information sent to other parties, and a large number of third parties being able to link user activity and locations across apps. Interestingly, while privacy is getting worse in aggregate, we find that the privacy risk of individual apps varies greatly over time, and a substantial fraction of apps see little change or even improvement in privacy. Given these trends, we propose metrics for quantifying privacy risk and for providing this risk assessment proactively to help users balance the risks and benefits of installing new versions of apps.
    2017
    • Obfuscation-Resilient Privacy Leak Detection for Mobile Apps Through Differential Analysis
      Continella, A., Fratantonio, Y., Lindorfer, M., Puccetti, A., Zand, A., Kruegel, C., & Vigna, G. (2017). Obfuscation-Resilient Privacy Leak Detection for Mobile Apps Through Differential Analysis. In Proceedings 2017 Network and Distributed System Security Symposium. Internet Society.
      DOI: 10.14722/ndss.2017.23465 Metadata ⯈Fulltext (preprint)
      Abstract
      Mobile apps are notorious for collecting a wealth of private information from users. Despite significant effort from the research community in developing privacy leak detection tools based on data flow tracking inside the app or through network traffic analysis, it is still unclear whether apps and ad libraries can hide the fact that they are leaking private information. In fact, all existing analysis tools have limitations: data flow tracking suffers from imprecisions that cause false positives, as well as false negatives when the data flow from a source of private information to a network sink is interrupted; on the other hand, network traffic analysis cannot handle encryption or custom encoding. We propose a new approach to privacy leak detection that is not affected by such limitations, and it is also resilient to obfuscation techniques, such as encoding, formatting, encryption, or any other kind of transformation performed on private information before it is leaked. Our work is based on black- box differential analysis, and it works in two steps: first, it establishes a baseline of the network behavior of an app; then, it modifies sources of private information, such as the device ID and location, and detects leaks by observing deviations in the resulting network traffic. The basic concept of black-box differential analysis is not novel, but, unfortunately, it is not practical enough to precisely analyze modern mobile apps. In fact, their network traffic contains many sources of non-determinism, such as random identifiers, timestamps, and server-assigned session identifiers, which, when not handled properly, cause too much noise to correlate output changes with input changes. The main contribution of this work is to make black-box dif- ferential analysis practical when applied to modern Android apps. In particular, we show that the network-based non-determinism can often be explained and eliminated, and it is thus possible to reliably use variations in the network traffic as a strong signal to detect privacy leaks. We implemented this approach in a tool, called AGRIGENTO, and we evaluated it on more than one thousand Android apps. Our evaluation shows that our approach works well in practice and outperforms current state-of-the-art techniques. We conclude our study by discussing several case studies that show how popular apps and ad libraries currently exfiltrate data by using complex combinations of encoding and encryption mechanisms that other approaches fail to detect. Our results show that these apps and libraries seem to deliberately hide their data leaks from current approaches and clearly demonstrate the need for an obfuscation-resilient approach such as ours.
Presentations (created while at TU Wien)
    2022
    • ART-assisted App Diffing: Defeating Dalvik Bytecode Shrinking, Obfuscation, and Optimization with Android's OAT Compiler
      Bleier, J., & Lindorfer, M. (2022, May 23). ART-assisted App Diffing: Defeating Dalvik Bytecode Shrinking, Obfuscation, and Optimization with Android’s OAT Compiler [Poster Presentation]. 43rd IEEE Symposium on Security and Privacy, San Francisco, United States of America (the).
      Metadata
      Abstract
      Android aims to provide a secure and feature-rich, yet resource-saving platform for its applications (apps). To achieve these goals, the compilation to distributable packages shrinks, obfuscates, and optimizes the code by default. As an additional optimization, the Android Runtime (ART) nowadays compiles the app’s bytecode to native code on the device instead of executing it in the Dalvik VM. We study the effects of these changes in the Android build and runtime environment on the problem of calculating app similarity. We compare existing bytecode-based tools to our novel approach of using the recompiled (and optimized) binary form. We propose OATMEAL, an extensible framework to generate reliable ground truth for evaluating app similarity approaches and provide a benchmark dataset to the community. We built this dataset from open-source apps available on F-Droid in various configurations that optimize and obfuscate the bytecode. Using this dataset, we show the limitations of existing Android-specific bytecode analysis approaches when faced with the new optimizing R8 bytecode compiler. We further demonstrate how well BinDiff, a state-of-the-art binary-based alternative, works in scoring the similarity of apps. With OATMEAL, we provide the foundation for integrating and benchmarking further approaches, both for calculating the similarity between apps (based on bytecode or binary code), and for evaluating their robustness to evolving optimization and obfuscation techniques.