About truffleHog
truffleHog is a tool that searches a Git repository for high-entropy strings and other sensitive data, such as secrets that were accidentally committed. It does this by digging deep into the commit history and branches of the target repository, and the findings can then be used to improve the security of our code repositories.
Operation Mechanism
The tool walks the entire commit history of every branch in the target Git repository, examines the diff of each commit, and checks it for sensitive data. Detection relies on both regular expressions and entropy. For the entropy checks, truffleHog evaluates the Shannon entropy, over both the base64 character set and the hexadecimal character set, of every block of text longer than 20 characters in each diff. Whenever a high-entropy string longer than 20 characters is detected, it is printed to the screen.
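The entropy check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not truffleHog's actual source: the function names are hypothetical, and the thresholds (4.5 bits for base64, 3.0 bits for hex) are values assumed for the example.

```python
import math

BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
HEX_CHARS = "0123456789abcdefABCDEF"


def shannon_entropy(text, charset):
    """Shannon entropy in bits per character, counting only charset symbols."""
    if not text:
        return 0.0
    entropy = 0.0
    for ch in charset:
        p = text.count(ch) / len(text)
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def runs_of_charset(word, charset, min_len=20):
    """Return maximal runs of charset characters that are longer than min_len."""
    runs, current = [], ""
    for ch in word:
        if ch in charset:
            current += ch
        else:
            if len(current) > min_len:
                runs.append(current)
            current = ""
    if len(current) > min_len:
        runs.append(current)
    return runs


def find_high_entropy_strings(diff_text, b64_threshold=4.5, hex_threshold=3.0):
    """Scan diff text word by word and collect suspicious strings.

    The two thresholds are assumptions made for this sketch.
    """
    findings = []
    for line in diff_text.split("\n"):
        for word in line.split():
            for run in runs_of_charset(word, BASE64_CHARS):
                if shannon_entropy(run, BASE64_CHARS) > b64_threshold:
                    findings.append(run)
            for run in runs_of_charset(word, HEX_CHARS):
                if shannon_entropy(run, HEX_CHARS) > hex_threshold:
                    findings.append(run)
    return findings
```

A repeated character such as "aaaa..." has zero entropy and is ignored, while a 32-character hex token in which all 16 digits appear equally often scores 4 bits per character and is reported.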
Tool Installation

truffleHog is written in Python, so it can be installed with pip:
pip install truffleHog
Custom Configuration
Custom regular expressions can be added with "--rules /path/to/rules", where the rules file is a JSON file in the following format:
{
    "RSA private key": "-----BEGIN EC PRIVATE KEY-----"
}
Particular secrets can also be explicitly allowed with the "--allow" option, which takes a JSON file in the following format:
{
    "local self signed test key": "-----BEGIN EC PRIVATE KEY-----\nfoobar123\n-----END EC PRIVATE KEY-----",
    "git cherry pick SHAs": "regex:Cherry picked from .*"
}
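Before handing a rules file to the tool, it can be useful to confirm that it is valid JSON and that every pattern compiles. The following is a small sketch of such a check, assuming the rules file is a flat JSON object that maps a label to a regular expression; the helper names load_rules and match_rules are hypothetical and are not part of truffleHog.

```python
import json
import re


def load_rules(json_text):
    """Parse a JSON object of {label: pattern} and compile each pattern."""
    raw = json.loads(json_text)
    return {label: re.compile(pattern) for label, pattern in raw.items()}


def match_rules(rules, text):
    """Return the labels of every rule whose pattern is found in text."""
    return [label for label, rx in rules.items() if rx.search(text)]


# The rule below is copied from the example rules file.
rules_json = '{"RSA private key": "-----BEGIN EC PRIVATE KEY-----"}'
rules = load_rules(rules_json)

# A made-up diff line, used only to exercise the rule.
hits = match_rules(rules, "+-----BEGIN EC PRIVATE KEY-----")
print(hits)
```

If a pattern is malformed, re.compile raises re.error, and json.loads raises json.JSONDecodeError on broken JSON, so both kinds of mistake surface before a scan is started.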
Note that earlier versions of truffleHog ran only entropy checks on Git diffs. That feature is still present, but the current version adds high-signal regular expression checks and also allows the entropy checks to be turned off:
trufflehog --regex --entropy=False https://github.com/dxa4481/truffleHog.git
or
trufflehog file:///user/dxa4481/codeprojects/truffleHog/
With the "--include_paths" and "--exclude_paths" options, the scan can also be limited to a subset of objects in the Git history. Each option takes a file of regular expressions (one per line) that are matched against Git object paths. The following example files are provided for reference:
include-patterns.txt:

src/
# lines beginning with "#" are treated as comments and are ignored
gradle/
# regexes must match the entire path, but can use python's regex syntax for
# case-insensitive matching and other advanced options
(?i).*\.(properties|conf|ini|txt|y(a)?ml)$
(.*\/)?id_[rd]sa$

exclude-patterns.txt:

(.*\/)?\.classpath$
.*\.jmx$
(.*\/)?test\/(.*\/)?resources\/
These filter files can then be applied with the following command:
trufflehog --include_paths include-patterns.txt --exclude_paths exclude-patterns.txt file://path/to/my/repo.git
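To make the include and exclude semantics concrete, here is a rough Python sketch of how such filter files could be evaluated against a path. It is an illustration only: the function names are hypothetical, and the use of re.match (anchored at the start of the path) is an assumption about how the patterns are applied.

```python
import re

INCLUDE_TEXT = r"""
src/
# lines beginning with "#" are treated as comments and are ignored
gradle/
(?i).*\.(properties|conf|ini|txt|y(a)?ml)$
(.*\/)?id_[rd]sa$
"""

EXCLUDE_TEXT = r"""
(.*\/)?\.classpath$
.*\.jmx$
(.*\/)?test\/(.*\/)?resources\/
"""


def compile_patterns(text):
    """Compile one regex per line, skipping blank lines and '#' comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        patterns.append(re.compile(line))
    return patterns


def path_included(path, include_patterns, exclude_patterns):
    """A path is scanned if it matches some include and no exclude pattern.

    An empty include list means that every path is included.
    """
    if include_patterns and not any(p.match(path) for p in include_patterns):
        return False
    if exclude_patterns and any(p.match(path) for p in exclude_patterns):
        return False
    return True


include = compile_patterns(INCLUDE_TEXT)
exclude = compile_patterns(EXCLUDE_TEXT)

for path in ["src/main/App.java", "src/test/resources/key.pem", "docs/readme.md"]:
    print(path, path_included(path, include, exclude))
```

With these example patterns, a file under src/ is scanned, a file under src/test/resources/ is dropped by the exclude list, and docs/readme.md is skipped because no include pattern matches it.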
With these filters applied, the tool reports only findings from Git object paths that pass them, with paths taken relative to the root directory of the target Git repository. The "-h" or "--help" option shows the full usage information.
Tool help information
usage: trufflehog [-h] [--json] [--regex] [--rules RULES] [--allow ALLOW]
                  [--entropy DO_ENTROPY] [--since_commit SINCE_COMMIT]
                  [--max_depth MAX_DEPTH]
                  git_url

Find secrets hidden in the depths of git.

positional arguments:
  git_url               URL for secret searching

optional arguments:
  -h, --help            show this help message and exit
  --json                Output in JSON
  --regex               Enable high signal regex checks
  --rules RULES         Ignore default regexes and source from json list file
  --allow ALLOW         Explicitly allow regexes from json list file
  --entropy DO_ENTROPY  Enable entropy checks
  --since_commit SINCE_COMMIT
                        Only scan from a given commit hash
  --branch BRANCH       Scans only the selected branch
  --max_depth MAX_DEPTH
                        The max commit depth to go back when searching for
                        secrets
  -i INCLUDE_PATHS_FILE, --include_paths INCLUDE_PATHS_FILE
                        File with regular expressions (one per line), at
                        least one of which must match a Git object path in
                        order for it to be scanned; lines starting with "#"
                        are treated as comments and are ignored. If empty or
                        not provided (default), all Git object paths are
                        included unless otherwise excluded via the
                        --exclude_paths option.
  -x EXCLUDE_PATHS_FILE, --exclude_paths EXCLUDE_PATHS_FILE
                        File with regular expressions (one per line), none of
                        which may match a Git object path in order for it to
                        be scanned; lines starting with "#" are treated as
                        comments and are ignored. If empty or not provided
                        (default), no Git object paths are excluded unless
                        effectively excluded via the --include_paths option.
Usage with Docker
First, change into the directory containing the target Git repository:
cd /path/to/git
Then run truffleHog from its Docker image with the following command:
docker run --rm -v "$(pwd):/proj" dxa4481/trufflehog file:///proj
The "-v" option mounts the current working directory (pwd) at the /proj directory inside the Docker container.
"file:///proj" refers to that /proj directory within the container.
Project address
truffleHog: https://github.com/dxa4481/truffleHog
