Merge branch 'feature-cb-36025' into 'develop'
CB-36025: Standalone mode

See merge request carbonblack/integrations/yara-connector!7
zacharyestep committed Jul 14, 2021
2 parents eb81dff + 9b60e1e commit a58b5d6
Showing 27 changed files with 1,035 additions and 1,929 deletions.
87 changes: 31 additions & 56 deletions README.md
@@ -1,11 +1,6 @@
# Installing YARA Agent (CentOS/RHEL 6/7/8)

[YARA](https://virustotal.github.io/yara/) Integration has two parts -- a primary and one or more minions. The primary
service must be installed on the same system as VMware CB EDR, while minions are usually installed on other systems (but
can also be on the primary system, if so desired). The YARA connector itself uses [Celery](http://www.celeryproject.org/)
to distribute work to remote (or local) minions - you will need to install and configure a
[broker](https://docs.celeryproject.org/en/latest/getting-started/brokers/) (e.g., [Redis](https://redis.io/)) that is
accessible to both the primary and remote minion instance(s).
[YARA](https://virustotal.github.io/yara/)

The connector reads YARA rules from a configured directory to efficiently scan binaries as they are seen by the EDR server.
The generated threat information is used to produce an intelligence feed for ingest by the EDR Server.
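
For illustration only (not code from this repository), the scanning step can be sketched with yara-python, which is already pinned in `requirements.txt`: compile every `.yar` file in the rules directory into one rule set and match a binary's bytes against it.

```python
# Illustrative sketch only -- not the connector's implementation.
import os
import yara  # provided by the pinned yara-python package

RULES_DIR = "/etc/cb/integrations/cb-yara-connector/yara_rules"

def compile_rules(rules_dir: str = RULES_DIR) -> "yara.Rules":
    """Compile every .yar file in the rules directory into one rule set."""
    sources = {
        os.path.splitext(name)[0]: os.path.join(rules_dir, name)
        for name in os.listdir(rules_dir)
        if name.endswith(".yar")
    }
    return yara.compile(filepaths=sources)

def scan_binary(rules: "yara.Rules", binary_bytes: bytes) -> list:
    """Return the names of the rules that match this binary's bytes."""
    return [match.rule for match in rules.match(data=binary_bytes)]
```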
@@ -26,13 +21,7 @@

The installation process creates a sample configuration file: `/etc/cb/integrations/cb-yara-connector/yaraconnector.conf.example`. Copy
this sample template to `/etc/cb/integrations/cb-yara-connector/yaraconnector.conf`,
which is the filename and location that the connector expects. You will likely have to edit this
configuration file on each system (primary and minions) to supply any missing information:
* There are two operating modes to support the two roles: `mode=primary` and `mode=minion`. Both modes require a broker
for Celery communications. Minion systems will need to change the mode to `minion`;
* Remote minion systems will require the primary's URL for `cb_server_url` (local minions need no modification);
they also require the token of a global admin user for `cb_server_token`.
* Remote minions will require the URL of the primary's Redis server
which is the filename and location that the connector expects. Users must edit this file to supply any missing information:

The daemon will attempt to load the PostgreSQL credentials from the EDR server's `cb.conf` file,
if available, falling back to the PostgreSQL connection information in the primary's configuration file using the
@@ -41,37 +30,13 @@

```ini
;
; Cb Response PostgreSQL Database settings, required for 'primary' and 'primary+minion' systems
; The server will attempt to read from local cb.conf file first and fall back
; to these settings if it cannot do so.
;
postgres_host=127.0.0.1
postgres_username=cb
postgres_password=<POSTGRES PASSWORD GOES HERE>
postgres_db=cb
postgres_port=5002
```
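
A minimal sketch of that fallback, assuming `cb.conf` is a flat key=value file and using a hypothetical key name; this is not the daemon's actual implementation:

```python
# Illustrative sketch only -- not the daemon's actual implementation.
# The cb.conf key name used below ("DatabasePassword") is a hypothetical placeholder.
import configparser

CB_CONF = "/etc/cb/cb.conf"
CONNECTOR_CONF = "/etc/cb/integrations/cb-yara-connector/yaraconnector.conf"

def load_postgres_settings() -> dict:
    """Prefer credentials from cb.conf when present, else use the connector config."""
    cb_conf = {}
    try:
        with open(CB_CONF) as fp:
            for line in fp:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, _, value = line.partition("=")
                    cb_conf[key.strip()] = value.strip()
    except OSError:
        pass  # no local cb.conf; fall back entirely to the connector config

    parser = configparser.ConfigParser()
    parser.read(CONNECTOR_CONF)
    general = parser["general"] if parser.has_section("general") else {}

    return {
        "host": general.get("postgres_host", "127.0.0.1"),
        "user": general.get("postgres_username", "cb"),
        "password": cb_conf.get("DatabasePassword", general.get("postgres_password")),
        "dbname": general.get("postgres_db", "cb"),
        "port": int(general.get("postgres_port", "5002")),
    }
```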

```ini
;
; EDR server settings, required for 'primary' and 'primary+minion' systems
; EDR server settings, required for standalone mode
; For remote workers, the cb_server_url must be that of the primary
;
cb_server_url=https://127.0.0.1
cb_server_token=<API TOKEN GOES HERE>
```
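
These credentials are what a standalone or minion instance would use to pull a binary that is not present locally. A hedged sketch of that fetch (the endpoint path and auth header follow the documented Cb Response REST API, not code from this repository):

```python
# Hedged sketch of pulling a binary by MD5 from the EDR REST API.
# Verify the endpoint against your EDR version before relying on this.
import requests

def fetch_binary(md5_hash: str, cb_server_url: str, cb_server_token: str) -> bytes:
    """Download the zipped binary for md5_hash from the EDR server."""
    resp = requests.get(
        f"{cb_server_url}/api/v1/binary/{md5_hash.upper()}",
        headers={"X-Auth-Token": cb_server_token},
        verify=False,  # EDR servers frequently use self-signed certificates
        timeout=60,
    )
    resp.raise_for_status()
    return resp.content  # a zip archive containing the binary
```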

You must configure `broker=`, which sets the broker and results_backend for Celery.
Set this appropriately as per the [Celery documentation](https://docs.celeryproject.org/en/latest/getting-started/brokers/).

```ini
;
; URL of the Redis server, defaulting to the local EDR server Redis for the primary. If this is a minion
; system, alter to point to the primary system. If you are using a standalone Redis server, both primary and
; minions must point to the same server.
;
broker_url=redis://127.0.0.1
```
## Create your YARA rules

The YARA connector monitors the directory `/etc/cb/integrations/cb-yara-connector/yara_rules` for files (`.yar`) each
@@ -147,35 +112,45 @@
Provides the path containing the feed description file. If not supplied, defaults to
`feed.json` in the same location as the configured `feed_database_dir` folder.

### --validate-yara-rules
If supplied, YARA rules will be validated and the script will exit.
If supplied, YARA rules will be validated and then the service will exit.
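
In practice, validation amounts to compiling each rule file and reporting any syntax errors. A minimal sketch of that idea with yara-python (not the connector's own code):

```python
# Sketch of rule validation: compile each file and report syntax errors.
import glob
import os
import sys
import yara

def validate_rules(rules_dir: str = "/etc/cb/integrations/cb-yara-connector/yara_rules") -> bool:
    ok = True
    for path in glob.glob(os.path.join(rules_dir, "*.yar")):
        try:
            yara.compile(filepath=path)
        except yara.SyntaxError as err:
            print(f"Invalid rule file {path}: {err}")
            ok = False
    return ok

if __name__ == "__main__":
    sys.exit(0 if validate_rules() else 1)
```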

# Development Notes
### Distributed operations
The Yara integration for EDR supports a distributed mode of operation where a primary instance queues binaries
to be scanned by a set of yara rules on a remote minion instance.

## Utility Script
Included with this version is a feature intended for discretionary use by advanced users; it should be
used with caution.
The primary instance must be installed on an EDR primary node, and configured to access the EDR modulestore (postgres).
The minion instance must be installed on another machine, and needs to be configured with the API credentials for EDR.
The primary and minion communicate using the Celery framework, which requires a Celery-supported broker and
results backend.

When `utility_interval` is defined with a value greater than 0, it represents the interval
in minutes at which the YARA connector will pause its work and execute an external
shell script. A sample script, `vacuumscript.sh`, is provided within the `scripts` folder
of the current YARA connector installation. After execution, the YARA connector continues with
its work.
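
A hedged sketch of how such an interval-driven hook could be implemented with `subprocess` (not the connector's actual scheduling code):

```python
# Illustrative sketch only; the connector's real scheduling logic may differ.
import subprocess
import time

def maybe_run_utility_script(utility_interval: int, utility_script: str, last_run: float) -> float:
    """Run the configured maintenance script if utility_interval minutes have elapsed."""
    if utility_interval <= 0 or not utility_script:
        return last_run  # feature disabled, or no script configured
    now = time.time()
    if now - last_run >= utility_interval * 60:
        # Pause normal work and execute the external shell script.
        subprocess.run(["/bin/sh", utility_script], check=False)
        return now
    return last_run
```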
* There are two operating modes to support the two roles: `mode=primary` and `mode=minion`. Both modes require a broker
for Celery communications. Minion systems will need to change the mode to `minion`;
* Remote minion systems will require the primary's URL for `cb_server_url` (local minions need no modification);
they also require the token of a global admin user for `cb_server_token`.
* Remote minions will require the URL of the desired Celery broker and results backend.

> _**NOTE:** As a safety for this feature, if an interval is defined but no script is defined, nothing is done.
> By default, no script is defined._
The primary service must be installed on the same system as VMware CB EDR, while minions are usually installed on other systems (but
can also be on the primary system, if so desired). The YARA connector itself uses [Celery](http://www.celeryproject.org/)
to distribute work to remote (or local) minions - you will need to install and configure a
[broker](https://docs.celeryproject.org/en/latest/getting-started/brokers/) (e.g., [Redis](https://redis.io/)) that is
accessible to both the primary and remote minion instance(s).

You must configure `broker=`, which sets the Celery broker, and can optionally configure `results_backend=`.
Set these appropriately as per the [Celery documentation](https://docs.celeryproject.org/en/latest/getting-started/brokers/).

```ini
;
; The use of the utility script is an ADVANCED FEATURE and should be used with caution!
; URL of the celery broker, typically the EDR local redis service
;
; If "utility_interval" is greater than 0 it represents the interval in minutes after which the YARA connector will
; pause to execute a shell script for general maintenance. This can present risks. Be careful what you allow the
; script to do, and use this option at your own discretion.
broker_url=redis://127.0.0.1
;
; the URL of the desired results backend, either redis again or another supported backend
;
utility_interval=-1
utility_script=./scripts/vacuumscript.sh
results_backend=redis://
```
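
For context, a minimal sketch of how these two values map onto a Celery application (the real task lives in `src/cbopensource/connectors/yara_connector/tasks.py`; the task body below is a placeholder):

```python
# Hypothetical wiring, for illustration only.
from celery import Celery

# Values correspond to broker_url= and results_backend= in yaraconnector.conf.
app = Celery(
    "yaraconnector",
    broker="redis://127.0.0.1:6379/0",
    backend="redis://127.0.0.1:6379/0",
)

@app.task
def analyze_binary_task(md5_hash: str, node_id: int) -> dict:
    # Placeholder body: a minion would fetch the binary and run the compiled
    # YARA rules against it, returning any matches.
    return {"md5": md5_hash, "node_id": node_id, "matches": []}

# A minion would then be started with something like:
#   celery -A yaraconnector worker --loglevel=INFO
```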

# Development Notes

## YARA Agent Build Instructions

The dockerfile in the top-level of the repo contains a CentOS 7 environment for running, building, and testing
2 changes: 1 addition & 1 deletion build.gradle.kts
@@ -35,4 +35,4 @@ val buildTask = tasks.named("build").configure {

python {
sourceExcludes.add("smoketest/")
}
}
4 changes: 2 additions & 2 deletions cb-yara-connector.rpm.spec
@@ -1,6 +1,6 @@
%define name python-cb-yara-connector
%define version 2.1.2
%define bare_version 2.1.2
%define version 2.2.0
%define bare_version 2.2.0
%define release 1

%global _enable_debug_package 0
26 changes: 6 additions & 20 deletions example-conf/yara.conf
@@ -1,19 +1,19 @@
[general]

;
; Operating mode - choose 'primary' for the main system, 'minion' for a remote minion.
; Operation Mode
;
mode=primary
mode=standalone

;
; path to directory containing yara rules
;
yara_rules_dir=/etc/cb/integrations/cb-yara-connector/yara_rules

;
; EDR PostgreSQL database settings, required for 'primary' systems
; EDR PostgreSQL database settings
; The server will attempt to read from local cb.conf file first and fall back
; to these settings if it cannot do so.
; to these settings if it cannot do so. These settings are not required
;
postgres_host=127.0.0.1
postgres_username=cb
@@ -22,16 +22,13 @@ postgres_db=cb
postgres_port=5002

;
; Cb Response Server settings, required for 'minion' systems
; For remote minions, the cb_server_url must be that of the primary
; Cb Response Server settings, required for standalone mode - This will be used to fetch binaries that are not available
;
cb_server_url=https://127.0.0.1
cb_server_token=<API TOKEN GOES HERE>

;
; URL of the redis server, defaulting to the local response server redis for the primary. If this is a minion
; system, alter to point to the primary system. If you are using a standalone redis server, both primary and
; minions must point to the same server
; the celery broker to use for distributed operations (not required in standalone mode)
;
broker_url=redis://localhost:6379

@@ -59,17 +56,6 @@ disable_rescan=False
;
num_days_binaries=365


;
; The use of the maintenance script is an ADVANCED FEATURE and should be used with caution!
;
; If "utility_interval" is greater than 0 it represents the interval in minutes after which the yara connector will
; pause to execute a user-created shell script designed for database maintenance, located with a "utility_script"
; definition that must be added. This can present risks. Be careful what you allow the script to do, and use this
; option at your own discretion.
;
utility_interval=0

;
; This can be used to adjust the interval (in seconds) at which the database is scanned.
;
4 changes: 2 additions & 2 deletions requirements.txt
@@ -3,7 +3,7 @@
# All versions are latest viable at date of package version release
################################################################################

celery==4.4.0
celery==4.4.7
humanfriendly==4.18
peewee==3.13.1
psycopg2-binary==2.8.4
@@ -14,4 +14,4 @@ simplejson==3.17.0
urllib3==1.25.7
yara-python==3.11.0
pyinstaller==4.2
cbfeeds==1.0.0
cbfeeds==1.0.0
2 changes: 1 addition & 1 deletion setup.py
@@ -73,7 +73,7 @@ def run(self):

setup(
name='python-cb-yara-connector',
version='2.1.2',
version='2.2.0',
packages=['cbopensource', 'cbopensource.connectors', 'cbopensource.connectors.yara_connector'],
package_dir={'': 'src'},
url='https://github.com/carbonblack/cb-yara-connector',
30 changes: 24 additions & 6 deletions smoketest/cmd.sh
@@ -18,8 +18,14 @@ echo "creating mock postgres..."
sudo -u cb /usr/bin/postgres -D /postgres -p 5002 -h 127.0.0.1 &
sleep 3
sudo -u cb createdb -p 5002 && echo 'createdb ok!'
sudo -u cb psql -p 5002 -c "CREATE TABLE storefiles(md5hash BYTEA NOT NULL PRIMARY KEY, timestamp TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP, present_locally BOOLEAN DEFAULT true ) ;"
sudo -u cb psql -p 5002 -c "insert into storefiles(md5hash) values('adfasfdsafdsafdsa');"
sudo -u cb psql -p 5002 -c "CREATE TABLE storefiles(md5hash BYTEA NOT NULL PRIMARY KEY, timestamp TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP, present_locally BOOLEAN DEFAULT true, node_id INTEGER DEFAULT 0) ;"
sudo -u cb psql -p 5002 -c "insert into storefiles(md5hash, node_id) values('\x45f48b1aacccd555bbc84c9df0ce2fa6', 0);"
sudo -u cb psql -p 5002 -c "insert into storefiles(md5hash, node_id) values('\x14018eb9e2f4488101719c4d29de2230', 1);"
mkdir -p /var/cb/data/modulestore/45F/48B/
echo "45F48B1AACCCD555BBC84C9DF0CE2FA6" >> filedata
zip 45F48B1AACCCD555BBC84C9DF0CE2FA6.zip filedata
mv 45F48B1AACCCD555BBC84C9DF0CE2FA6.zip /var/cb/data/modulestore/45F/48B/45F48B1AACCCD555BBC84C9DF0CE2FA6.zip

echo 'mock postgres creation complete -- starting redis'
sudo systemctl start redis && echo 'redis started ok'

@@ -36,14 +42,26 @@ cp $2/yaraconnector.conf /etc/cb/integrations/cb-yara-connector/
mkdir -p /etc/cb/integrations/cb-yara-connector/yara_rules
cp $2/smoketest.yar /etc/cb/integrations/cb-yara-connector/yara_rules

#sleep 99999
systemctl start cb-yara-connector

#give the connector some time to run, then check the feed.json for matches
#sleep 9999999999
sleep 5
grep "Matched yara rules: smoketest" /var/cb/data/cb-yara-connector/feed.json >/dev/null || echo "Yara connector not working..."
echo "Yara connector working ok!"

systemctl stop cb-yara-connector
#sleep 99999

count=$(grep -c "Matched yara rules: smoketest" /var/cb/data/cb-yara-connector/feed.json)
echo "count is $count"
if [ "$count" = "4" ]
then
echo "Yara connector working ok!"
else
echo "Yara connector not working ok!"
exit 1
fi

log_line_count=$(wc -l /var/log/cb/integrations/cb-yara-connector/yaraconnector.log)
echo "Log line count is $log_line_count"

yum -y remove python-cb-yara-connector
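
The smoketest above stages its sample binary using EDR's modulestore layout (first three hex characters of the MD5, then the next three, then `<MD5>.zip`). A small sketch of that path derivation, treated as an assumption drawn from this script rather than a specification:

```python
# Sketch of the modulestore layout staged above.
from pathlib import Path

def modulestore_path(md5_hash: str, base: str = "/var/cb/data/modulestore") -> Path:
    md5_hash = md5_hash.upper()
    return Path(base) / md5_hash[:3] / md5_hash[3:6] / f"{md5_hash}.zip"

print(modulestore_path("45f48b1aacccd555bbc84c9df0ce2fa6"))
# -> /var/cb/data/modulestore/45F/48B/45F48B1AACCCD555BBC84C9DF0CE2FA6.zip
```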

77 changes: 45 additions & 32 deletions src/cbopensource/connectors/yara_connector/analysis_worker.py
@@ -4,48 +4,61 @@
from celery.exceptions import WorkerLostError

from cbopensource.connectors.yara_connector.loggers import logger
from .tasks import analyze_binary
from . import globals
from .tasks import analyze_binary, analyze_binary_task


def analysis_minion(exit_event: Event, hash_queue: Queue, scanning_results_queue: Queue) -> None:
"""
The promise worker scanning function.
:param exit_event: event signaller
:param hash_queue
:param scanning_results_queue: the results queue
"""
def analysis_minion(thread_num: int, exit_event: Event, hash_queue: Queue, scanning_results_queue: Queue, chunked=True,
max_hashes: int = 8) -> None:
logger.debug(f"Analysis thread {thread_num} starting")
exception = None
try:
while not (exit_event.is_set()):
if not (hash_queue.empty()):
try:
exit_set = False
md5_hashes = hash_queue.get()
promise = analyze_binary.chunks(
[(mh,) for mh in md5_hashes], globals.g_max_hashes
).apply_async()
for prom in promise.children:
exit_set = exit_event.is_set()
if exit_set:
break
results = prom.get(disable_sync_subtasks=False)
scanning_results_queue.put(results)
if not exit_set:
promise.get(disable_sync_subtasks=False, timeout=1)
if chunked:
handle_chunked(hash_queue, exit_event, scanning_results_queue, max_hashes=max_hashes)
else:
promise.forget()
hash_queue.task_done()
handle_single(hash_queue, scanning_results_queue)
except Empty:
exit_event.wait(1)
except WorkerLostError as err:
logger.debug(f"Lost connection to remote minion and exiting: {err}")
exit_event.set()
break
exception = err
logger.exception(f"Lost connection to remote minion and exiting: {err}")
except Exception as err:
logger.debug(f"Exception in wait: {err}")
exit_event.wait(0.1)
logger.exception(f"Error in analysis worker: {err}")
exception = err
finally:
hash_queue.task_done()
else:
exit_event.wait(1)
exit_event.wait(0.25)
finally:
logger.debug(f"ANALYSIS MINION EXITING {exit_event.is_set()}")
if exit_event.is_set():
logger.debug(f"Analysis worker {thread_num} exiting")
else:
logger.exception(f"Analysis worker {thread_num} exiting due to error {exception}")


def handle_chunked(hash_queue, exit_event, scanning_results_queue, max_hashes: int = 8):
exit_set = False
md5_hashes = hash_queue.get()
promise = analyze_binary_task.chunks(
[(mh[0], mh[1]) for mh in md5_hashes], max_hashes
).apply_async()
for prom in promise.children:
exit_set = exit_event.is_set()
if exit_set:
break
results = prom.get(disable_sync_subtasks=False)
scanning_results_queue.put(results)
if not exit_set:
promise.get(disable_sync_subtasks=False, timeout=1)
else:
promise.forget()


def handle_single(hash_queue, scanning_results_queue):
md5_hash, node_id = hash_queue.get()
logger.debug(f"Analyzing hash {md5_hash}")
result = analyze_binary(md5_hash, node_id)
logger.debug(f"Done Analyzing {md5_hash}")
scanning_results_queue.put(result)
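
A hypothetical example of wiring this worker into a thread, assuming `analysis_minion` from the module above is importable (queue contents inferred from `handle_single`, not taken from elsewhere in the repo):

```python
# Hypothetical usage, not code from this commit: running the worker in a thread.
from queue import Queue
from threading import Event, Thread

exit_event = Event()
hash_queue: Queue = Queue()
results_queue: Queue = Queue()

worker = Thread(
    target=analysis_minion,
    args=(0, exit_event, hash_queue, results_queue),
    # chunked=False scans in-process via handle_single (presumably the standalone path);
    # chunked=True dispatches through Celery via handle_chunked.
    kwargs={"chunked": False},
    daemon=True,
)
worker.start()

# In single mode each queue item is an (md5_hash, node_id) pair, per handle_single above.
hash_queue.put(("45f48b1aacccd555bbc84c9df0ce2fa6", 0))
```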