cfascinate2014

Malwarebytes: The Best Cyber Security Software for Home and Business



Malwarebytes Inc. is a US IT security company based in Santa Clara that offers the Malwarebytes software for identifying and removing malware on the Windows, Android, and macOS operating systems. In 2017, Malwarebytes Inc. generated revenue of 126.2 million US dollars.[1]


Malwarebytes was founded in January 2008 by Marcin Kleczynski and Bruce Harrison. The company's origins, however, go back to 2004, when Kleczynski, by his own account, taught himself programming after his computer had been infected by malware.[2]








In 2011, Malwarebytes acquired "HPhosts", a company that blocklists websites and ad servers. In June 2015, the company announced that it was moving its headquarters from 10 Almaden Boulevard in San Jose, California, to a new 4,800 m² office on the top two floors of the 12-story building at 3979 Freedom Circle, also in California. From 2014 to 2015 the company grew by 10 million users in a single year, and its revenue increased by 1653% in 2014. By 2015, Malwarebytes had repaired over 250 million computers worldwide and remediated over five billion malware threats. Malwarebytes currently has over 700 employees in 18 countries.[3]


The main purposes of these hashes are identification and blocklisting of samples. Using them for blocklisting makes sense because an attacker will have difficulty designing malware with the same hash value as a clean file. They are ideal for identification because cryptographically secure hashes are meant to make collisions unlikely.


MD5 and SHA-1 should no longer be used because they have been broken [fisher20][kashyap06]. For MD5, for example, people can create hash collisions in a way that allows control over the content [kashyap06]. Both are nevertheless still sometimes used in hash listings of malware articles, and some detection technologies might still work with MD5 hashes because computing them is fast and the values do not need much storage space. MD5 therefore remains an important and common search option for sample databases.
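
As a rough illustration of identification and blocklisting with a cryptographic hash, here is a minimal sketch in Python. It assumes SHA-256 as the hash function, and the sample bytes are made up for the example; a real workflow would read the sample from disk and load the blocklist from a database.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of a sample as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def is_blocklisted(data: bytes, blocklist: set[str]) -> bool:
    """Check a sample against a set of known-bad SHA-256 hex digests."""
    return sha256_of(data) in blocklist


# Hypothetical sample bytes, used only for illustration.
known_bad = b"pretend this is a malicious sample"
blocklist = {sha256_of(known_bad)}

print(is_blocklisted(known_bad, blocklist))         # identical bytes match
print(is_blocklisted(known_bad + b"!", blocklist))  # one appended byte does not
```

The second lookup fails because any change to the input yields an unrelated digest, which is exactly why these hashes identify a sample precisely but cannot find near-duplicates.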


The development of ssdeep was a milestone at the time. New hashing algorithms which improve certain aspects of ssdeep have been created since. E.g., SimFD has a better false positive rate and MRSH improved security aspects of ssdeep [breitinger13]. The author's website states that ssdeep is still often preferred due to its speed (e.g., compared to TLSH) and it is the "de facto standard" for fuzzy hashing algorithms used for malware samples and their classification. Sample databases like VirusTotal and Malwarebazaar support it.


TLSH stands for Trend-Micro Locality Sensitive Hash, which was published in a paper in 2013 [oliver13]. According to their paper TLSH has better accuracy than ssdeep when classifying malware samples [p.12, oliver13]. Just like ssdeep it is a CTPH. TLSH is supported by VirusTotal.


The idea of SIF hashing is to find features of a file that are unlikely to be present by chance and compare those features to other files. Sdhash uses entropy calculation to pick the relevant features and then creates the hash value based on them. That also means sdhash cannot fully cover a file, and modifications to a file may not influence the hash value at all if they are not part of a statistically improbable feature. Sdhash shows better accuracy than ssdeep when classifying malware samples [p.12, oliver13][roussev11]. However, its strong suit is the detection of fragments and not the comparison of files [p.8, breitinger12].


Control flow graph hashes are not only useful for AV detection and sample clustering. They are also suitable for producing a binary diff of samples, i.e., for identifying similar and different functions in two samples. Binary diffing is a common technique in malware analysis to find differences between two versions of a malware family or to identify re-used code in different malware families. Control flow hashing may also be applied to automatically rename known functions, thus improving the readability of disassembled code for reversers.


All of these hashing algorithms work with imported functions, types or modules. The idea is that the imports indicate behavioral capabilities of a malware, so a hash value will hopefully be the same for samples with similar capabilities.


Using a cryptographic hash as an intermediate step for import hashing is not ideal, keeping in mind that the idea behind ImpHash was to cluster samples with similar behavioral capabilities. Algorithms like ImpHash and TypeRefHash only determine clusters of samples that have exactly the same imports. Fuzzy hash values, in contrast, look similar if the input was similar. That is why algorithms like ImpFuzzy were created, which use ssdeep instead of MD5. A recent study [naik20] shows better results in malware classification tasks for fuzzy import hashing methods that employ ssdeep, sdHash or mvHash-B compared to MD5 for the ImpHash.
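
The limitation of MD5-based import hashing can be sketched in a few lines of Python. This is a simplified illustration, not the reference ImpHash implementation: it assumes the imports have already been extracted as (module, function) pairs, it does not resolve ordinals, and the two import lists are invented. The `difflib` ratio at the end is only a stand-in for a fuzzy comparison and is not ssdeep.

```python
import hashlib
from difflib import SequenceMatcher


def import_string(imports: list[tuple[str, str]]) -> str:
    """Build a normalized, comma-separated 'module.function' list in import order."""
    parts = []
    for module, function in imports:
        module = module.lower()
        # Drop common library extensions so 'KERNEL32.dll' becomes 'kernel32'.
        for ext in (".dll", ".ocx", ".sys"):
            if module.endswith(ext):
                module = module[: -len(ext)]
                break
        parts.append(f"{module}.{function.lower()}")
    return ",".join(parts)


def imphash_like(imports: list[tuple[str, str]]) -> str:
    """MD5 over the normalized import string (simplified ImpHash-style value)."""
    return hashlib.md5(import_string(imports).encode("ascii")).hexdigest()


# Hypothetical import lists: sample_b has one additional import.
sample_a = [
    ("KERNEL32.dll", "CreateFileW"),
    ("KERNEL32.dll", "WriteFile"),
    ("ADVAPI32.dll", "RegSetValueExW"),
]
sample_b = sample_a + [("WS2_32.dll", "connect")]

print(imphash_like(sample_a) == imphash_like(sample_b))  # exact match fails
similarity = SequenceMatcher(None, import_string(sample_a), import_string(sample_b)).ratio()
print(round(similarity, 2))  # the import strings themselves are still very similar
```

One extra import is enough to put the two samples into different MD5 clusters, while a similarity measure over the same import string still reports them as close.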


The ImpFuzzy blog post evaluates malware family classification for 200 non-packed samples using either ssdeep for the whole file, ImpHash (MD5 on imports) or ImpFuzzy (ssdeep on imports). For this specific test setup, ImpFuzzy shows consistently better success rates than the other two hashing algorithms (see image below) but the author also states that this setting creates false positives.


The hash value is created by compressing the input to 4 bytes, then mapping each byte to a word from a wordlist. The author states its uniqueness is 1 in 4.3 billion. This hash is not robust against collisions, but it does not have to be.
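
The two steps can be sketched as follows. This is a minimal approximation of the idea, assuming the input is compressed by XOR-folding equal-sized segments into one byte each; the wordlist here is a placeholder of 256 generated names, not the actual humanhash wordlist.

```python
from functools import reduce
from operator import xor

# Placeholder wordlist (assumption): the real tool ships 256 readable words.
WORDLIST = [f"word{i:03d}" for i in range(256)]


def compress(data: bytes, target: int = 4) -> list[int]:
    """Reduce the input to `target` bytes by XOR-ing each segment together."""
    if len(data) < target:
        raise ValueError("need at least `target` bytes of input")
    seg_len = len(data) // target
    segments = [data[i * seg_len:(i + 1) * seg_len] for i in range(target)]
    # Leftover bytes that do not fill a whole segment go into the last one.
    segments[-1] += data[target * seg_len:]
    return [reduce(xor, segment) for segment in segments]


def humanhash_like(digest: bytes, words: int = 4) -> str:
    """Map each compressed byte to one word and join the words with dashes."""
    return "-".join(WORDLIST[b] for b in compress(digest, words))


print(humanhash_like(bytes([1, 2, 3, 4, 5, 6, 7, 8])))
print(256 ** 4)  # number of possible 4-word values, about 4.3 billion
```

Four bytes with 256 possible values each give 256^4 = 4,294,967,296 combinations, which matches the stated uniqueness of 1 in 4.3 billion.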


In my personal opinion more sample sharing platforms should add humanhash to their list of hashes. E.g., it would be a great addition to VirusTotal. Malwarebazaar supports humanhash and seeing it among the other hash values (image below) makes apparent what this sample will be remembered by apart from the filename and AgentTesla tag.


Checking icon similarity is especially useful if malware pretends to be a known application or office document. E.g., it is common for malware to try to appear as Word or PDF document by using icons for these applications in combination with double extensions like pdf.exe or file extension spoofing. Detecting such malware techniques with signatures or searching for them in databases is possible with similarity hashes that are specifically for comparing pictures, e.g., VirusTotal and Malwarebazaar support searches via dHash.
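
To give an idea of how a difference hash (dHash) compares pictures, here is a minimal sketch. It assumes the icon has already been converted to grayscale and resized to 9 columns by 8 rows, which a real implementation would do with an image library; the pixel grids are invented, and the bit order and comparison direction vary between implementations.

```python
def dhash_bits(gray: list[list[int]]) -> int:
    """Build a hash with one bit per horizontally adjacent pixel pair."""
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            # Bit is 1 when brightness decreases from left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count differing bits; a small distance means similar pictures."""
    return bin(a ^ b).count("1")


# Hypothetical 9x8 grayscale icons: a gradient, a brighter copy, a mirrored copy.
icon = [list(range(0, 90, 10)) for _ in range(8)]
brighter = [[p + 5 for p in row] for row in icon]
mirrored = [row[::-1] for row in icon]

print(hamming(dhash_bits(icon), dhash_bits(brighter)))  # unchanged by brightness shift
print(hamming(dhash_bits(icon), dhash_bits(mirrored)))  # every bit differs
```

Because only the relative brightness of neighboring pixels is encoded, small changes such as recompression or a uniform brightness shift tend to leave most bits intact.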


This cryptographic hash is computed on signed PE files and is an important part of Microsoft's digital signature format, Authenticode. Its purpose is to verify that a file has not been tampered with after it has been signed by a software publisher. File manipulation would result in a different hash value than the one listed in the file's digital signature. The Authentihash covers the PE image excluding certificate-related data and the overlay. That means appended data does not affect the hash value, which has been abused by polyglot malware, that is, malware that has several file types at once. More details about such malware are in the article "Code-Signing: How Malware gets a Free Pass".


[kim20] Kim, Jun-Seob & Jung, Wookhyun & Kim, Sangwon & Lee, Shinho & Kim, Eui. (2020). Evaluation of Image Similarity Algorithms for Malware Fake-Icon Detection. 1638-1640. 10.1109/ICTC49870.2020.9289501.


[naik20] N. Naik, P. Jenkins, N. Savage, L. Yang, T. Boongoen and N. Iam-On, "Fuzzy-Import Hashing: A Malware Analysis Approach," 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2020, pp. 1-8, doi: 10.1109/FUZZ48607.2020.9177636.


Upon opening the executable in IDA, we can see that most of the assembly code does not make sense. An example is WinMain, which has no clear return statement and where garbage bytes pop up among valid code.


As shown in the disassembled code above, the control flow in WinMain calls sub_4142F5, and upon return, edi is popped and we run into the garbage bytes at 0x4142F2. As a result, IDA fails to decompile this code properly.


Before processing a drive, the malware extracts the following ransom note content and drops it into the drive folder. This is the only place where the ransom note is dropped; unlike other ransomware, PLAY does not drop a note in every folder.


Once the structure is found, the malware sets its initialized_flag field to 1 and the filename field to the target filename. It also populates other fields such as the file size, large file flag, and file handle.


If the file is not classified as a large file, the malware calculates how many chunks it needs to encrypt depending on the file size. The number of encrypted chunks is 2 if the file size is less than or equal to 0x3fffffff bytes, 3 if the file size is greater than 0x3fffffff bytes and less than or equal to 0x27fffffff bytes, 0 if the file size is exactly 0x280000000 bytes, and 5 if the file size is greater than 0x280000000 bytes.
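
The size thresholds can be summarized in a short Python sketch. The function name is made up for illustration and the code mirrors only the thresholds described in the text, not the malware's actual implementation.

```python
def encrypted_chunk_count(file_size: int) -> int:
    """Number of chunks encrypted for a file that is not flagged as large."""
    if file_size <= 0x3FFFFFFF:
        return 2
    if file_size <= 0x27FFFFFFF:
        return 3
    if file_size == 0x280000000:
        # Edge case from the analysis: exactly this size yields zero chunks.
        return 0
    return 5


for size in (0x1000, 0x40000000, 0x280000000, 0x280000001):
    print(hex(size), encrypted_chunk_count(size))
```

Written this way, the boundary case stands out: a file of exactly 0x280000000 bytes falls between the 3-chunk and 5-chunk ranges and gets no encrypted chunks.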


First, PLAY reads 0x428 bytes at the end of the file to check the file footer. If the file size is smaller than 0x428 bytes, the file is guaranteed to not be encrypted, so the malware moves to encrypt it immediately.


If the last 0x428 bytes are read successfully, the malware then checks whether the xxHash32 hash of the footer marker head is equal to the footer marker tail. If they match, the file footer is confirmed to be valid, and the file is already encrypted.


It calls BCryptGenRandom to generate a random 0x20-byte buffer. Depending on the chaining mode specified in the file structure, the malware calls BCryptSetProperty to set the chaining mode property on its AES provider handle.


Once the file footer is fully populated, the malware calls SetFilePointerEx to move the file pointer to the end of the file and calls WriteFile to write the structure there.

