Assemblyline – Distributed File Analysis Framework

Assemblyline is a scalable, distributed file analysis framework. It is designed to process millions of files per day but can also be installed on a single box.

Canada’s electronic spy agency says it is taking the “unprecedented step” of releasing one of its own cyber defence tools to the public, in a bid to help companies and organizations better defend their computers and networks against malicious threats.

An Assemblyline cluster consists of 3 types of boxes: Core, Datastore and Worker.


Assemblyline Core

The Assemblyline Core server runs all the components required to receive tasks and dispatch them to the different workers. It hosts the following processes:

  • Redis (Queue/Messaging)
  • FTP (proftpd: File transfer)
  • Dispatcher (Worker tasking and job completion)
  • Ingester (High volume task ingestion)
  • Expiry (Data deletion)
  • Alerter (Creates alerts when score threshold is met)
  • UI/API (NGINX, UWSGI, Flask, AngularJS)
  • Websocket (NGINX, Gunicorn, GEvent)
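
To make the division of labour between the Ingester, Dispatcher and Alerter more concrete, here is a minimal in-memory sketch of that flow. It is not Assemblyline code: the Task class, the ingest/dispatch/alert functions, the two toy workers and the threshold value of 500 are all hypothetical, and a plain Python deque stands in for the Redis queue.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical threshold; the real value is a deployment setting.
ALERT_THRESHOLD = 500


@dataclass
class Task:
    """One file submitted for analysis (illustrative only)."""
    file_name: str
    score: int = 0
    results: dict = field(default_factory=dict)


def ingest(queue, file_names):
    """Ingester role: turn incoming files into queued tasks."""
    for name in file_names:
        queue.append(Task(name))


def dispatch(queue, workers):
    """Dispatcher role: hand each task to every worker, collect scores."""
    completed = []
    while queue:
        task = queue.popleft()
        for worker_name, worker in workers.items():
            score = worker(task.file_name)
            task.results[worker_name] = score
            task.score += score
        completed.append(task)
    return completed


def alert(completed, threshold=ALERT_THRESHOLD):
    """Alerter role: report tasks whose total score meets the threshold."""
    return [t.file_name for t in completed if t.score >= threshold]


# Toy workers standing in for real analysis services.
workers = {
    "extension_check": lambda name: 600 if name.endswith(".exe") else 0,
    "length_check": lambda name: 10 if len(name) > 20 else 0,
}

queue = deque()  # stands in for the Redis queue
ingest(queue, ["report.pdf", "invoice.exe"])
completed = dispatch(queue, workers)
alerts = alert(completed)
print(alerts)
```

In this sketch only invoice.exe reaches the threshold, so it is the only file that raises an alert. A real deployment would run these roles as separate processes that communicate through Redis rather than as function calls in one script.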

Assemblyline Datastore

Assemblyline uses Riak for persistent data storage. Riak is a key/value datastore with Solr integration for search. It is fully distributed and horizontally scalable.
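
The idea of a key/value store paired with a search index can be illustrated with a small in-memory sketch. This is not the Riak or Solr API: TinyStore and its put/get/search methods are hypothetical, the use of a SHA-256 digest as the key is an assumption made for the example, and the exact-match index is only a stand-in for real full-text search.

```python
import hashlib
from collections import defaultdict


class TinyStore:
    """Toy key/value store with an exact-match index (illustrative only)."""

    def __init__(self):
        self._data = {}                 # key -> record
        self._index = defaultdict(set)  # (field, value) -> set of keys

    def put(self, key, record):
        # Drop index entries for any record being overwritten.
        old = self._data.get(key)
        if old is not None:
            for field_name, value in old.items():
                self._index[(field_name, value)].discard(key)
        self._data[key] = dict(record)
        for field_name, value in record.items():
            self._index[(field_name, value)].add(key)

    def get(self, key):
        """Direct lookup by key, the basic key/value operation."""
        return self._data.get(key)

    def search(self, field_name, value):
        """Return the keys of records whose field equals the given value."""
        return sorted(self._index.get((field_name, value), set()))


store = TinyStore()

# Assumption for this example: records are keyed by the file's SHA-256.
key = hashlib.sha256(b"example file contents").hexdigest()
other = hashlib.sha256(b"another file").hexdigest()

store.put(key, {"file_type": "pdf", "verdict": "clean"})
store.put(other, {"file_type": "exe", "verdict": "malicious"})

hits = store.search("verdict", "malicious")
print(hits == [other])
```

The lookup by key is what the datastore provides on its own; the search method shows why an index is needed as soon as records must be found by their contents rather than by their key.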

