Current methods of software identification rely on file hashes. We are interested in developing a cyber genome, or application disk-print, that can identify software from multiple characteristics, including analysis of the registry and memory, fuzzy hashes, block hashes, runtime characteristics, and other attributes that have yet to be defined. The National Software Reference Library contains over 11,000 software applications, stored online in a research environment, that can be used to support this research.
File hashes; Fuzzy hashes; Block hashes; Cyber genome; National Software Reference Library
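
To illustrate the difference between a file hash and block hashes as identification characteristics, the following is a minimal Python sketch, not the project's method. It assumes SHA-256 and a fixed 4096-byte block size, both chosen only for illustration; the function names `file_hash`, `block_hashes`, and `shared_block_fraction` are hypothetical. Fuzzy hashing is a separate technique and is not shown here.

```python
from __future__ import annotations

import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def file_hash(data: bytes) -> str:
    """Hash the whole content; any single-byte change alters the result."""
    return hashlib.sha256(data).hexdigest()


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash each fixed-size block independently (the last block may be shorter)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def shared_block_fraction(a: bytes, b: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Fraction of the distinct block hashes of `a` that also occur in `b`."""
    hashes_a = set(block_hashes(a, block_size))
    hashes_b = set(block_hashes(b, block_size))
    return len(hashes_a & hashes_b) / len(hashes_a) if hashes_a else 0.0


if __name__ == "__main__":
    # Synthetic 8-block "file" in which every block has different content.
    original = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
    # Simulate a small patch: change only the first byte.
    patched = b"\xff" + original[1:]

    print(file_hash(original) == file_hash(patched))   # False
    print(shared_block_fraction(original, patched))    # 0.875
```

In this synthetic case a one-byte change makes the file hashes differ completely, while seven of the eight block hashes still match. This is why block-level characteristics can still link a modified file to a known application when a file hash alone cannot.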