Predicting future terrorist targets with Amazon Neptune ML

Olalekan Elesin
7 min read · Jan 23, 2022
Modeling Terrorist Attacks in Nigeria as a Graph Network

Graph-based recommendation engines are common in real-world applications, ranging from social networks to product recommendations on e-commerce platforms. Many of us rely on recommendation systems to build new professional relationships (LinkedIn), make purchase decisions (Amazon) or discover new music (Spotify). In a social network such as LinkedIn, new connections are suggested to a user based on that user’s existing connections: users with common connections are likely to know, or want to connect with, each other. This is why social networks are naturally expressed as graphs.

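This blogpost was inspired by a recent blog post from AWS, Graph-based recommendation system with Neptune ML: An illustration on social network link prediction challenges.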

Context

Terrorist attacks can be naturally expressed as a graph network in which the nodes represent the targeted locations and the edges represent the connections between them, here the time sequence of the attacks. The following illustrates an example network. Imagine a network of terrorist targets with the locations (nodes) Maina Hari, Pulka, Mandaragirau, Tungushe, Kaiga, Konduga and Gasarwa. A relationship between two locations is represented by a link (edge), and each location’s target attributes, such as military bases, religious institutions, NGOs, private citizens & property, police stations, educational institutions and government offices, are represented by node properties.

Boko Haram Attacks in Nigeria as a Graph Network

The objective is to predict the next possible target location by checking whether there is a potential missing link between locations. For example, should we alert security agencies that there might be a connection between Pulka and Tungushe? Looking at the graph, we can see that they have two locations in common, Ala and Maina Hari. Therefore, there is a possibility that Tungushe and Pulka are themselves connected, and that an attack on one may be followed by an attack on the other. Questions and intuitions like these are the core reasoning behind network recommendation systems.
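The common-neighbour intuition above can be sketched in a few lines of Python. This is only an illustration: the edge list is hypothetical, and the only facts taken from the example are that Pulka and Tungushe both connect to Ala and Maina Hari.

```python
# Hypothetical edge list. Only the Pulka/Tungushe links to Ala and
# Maina Hari come from the example; the remaining edges are invented.
edges = [
    ("Pulka", "Ala"),
    ("Pulka", "Maina Hari"),
    ("Tungushe", "Ala"),
    ("Tungushe", "Maina Hari"),
    ("Konduga", "Kaiga"),
]

def build_adjacency(edge_list):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adjacency = {}
    for a, b in edge_list:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def common_neighbours(adjacency, a, b):
    """Return the locations linked to both a and b."""
    return adjacency.get(a, set()) & adjacency.get(b, set())

adjacency = build_adjacency(edges)
shared = common_neighbours(adjacency, "Pulka", "Tungushe")
print(sorted(shared))  # ['Ala', 'Maina Hari']
```

A GNN goes well beyond counting shared neighbours, but this is the kind of structural signal a link prediction model can pick up on.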

In the remainder of this blogpost, using the Global Terrorism Database, we demonstrate how to express terrorist attacks as a social network. We then use Amazon Neptune ML and Amazon SageMaker to train and deploy a recommendation engine that can predict the next possible terrorist attacks. The goal of a system such as this is to improve AI-based intelligence for countering terrorism (or even organized crime).

We will walk through how to use Graph Neural Networks (GNNs) to recommend the next possible locations to be attacked, framed as a link prediction problem. We will train a GNN model with Amazon Neptune ML and run inference on the demo graph through a link prediction task. Sample code is available on GitHub (on request).

Link prediction with Graph Neural Networks

Consider the previous network of past terrorist targets. We would like to anticipate, or recommend, the locations a terrorist group is likely to attack next, and so augment the intelligence gathering of security operatives with information on potential target locations. Konduga and Tungushe share common city/town attributes, but no common links. Which would be the better recommendation? When framed as a link prediction problem, the task is to assign a score to any possible link between two nodes. The higher the link score, the more likely the recommendation is to hold. By learning the link structures already present in the graph, a link prediction model can generalize to new link predictions that ‘complete’ the graph.
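As a rough sketch of what "assigning a score to a possible link" can look like, the snippet below scores candidate links from node embeddings using a dot product passed through a sigmoid. The embedding values are made up for illustration, and this simple scoring function is an assumption rather than the exact decoder Neptune ML trains.

```python
import math

# Toy node embeddings (invented values). In practice a trained GNN
# would produce one such vector per location.
embeddings = {
    "Pulka": [0.9, 0.1, 0.3],
    "Tungushe": [0.8, 0.2, 0.4],
    "Konduga": [-0.5, 0.7, 0.1],
}

def link_score(source, target):
    """Score a candidate link as sigmoid(dot(source, target)), in (0, 1)."""
    dot = sum(a * b for a, b in zip(embeddings[source], embeddings[target]))
    return 1.0 / (1.0 + math.exp(-dot))

def rank_candidates(source, candidates):
    """Order candidate targets from most to least likely link."""
    return sorted(candidates, key=lambda c: link_score(source, c), reverse=True)

ranking = rank_candidates("Pulka", ["Konduga", "Tungushe"])
print(ranking)  # ['Tungushe', 'Konduga']
```

With these toy vectors, Tungushe ranks above Konduga as a candidate link for Pulka because its embedding points in a similar direction.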

To learn more about link prediction in Graph Neural Networks and how GNNs work, check the blogpost from AWS, Graph-based recommendation system with Neptune ML: An illustration on social network link prediction challenges.

Train our Graph Convolution Network with Amazon Neptune ML

Neptune ML uses graph neural network technology to automatically create, train, and deploy ML models on your graph data. Neptune ML supports common graph prediction tasks, such as node classification and regression, edge classification and regression, and link prediction. Neptune ML is powered by:

  • Amazon Neptune: a fast, reliable, and fully managed graph database, which is optimized for storing billions of relationships and querying the graph with millisecond latency.
  • Amazon SageMaker: a fully managed service that provides every developer and data scientist with the ability to prepare, build, train, and deploy ML models quickly.
  • Deep Graph Library (DGL): an open-source, high-performance, and scalable Python package for deep learning on graphs.

You can easily get started with Neptune ML using the AWS CloudFormation quickstart template. It sets up the necessary infrastructure components, including a Neptune DB cluster, the network configuration, IAM roles, and an associated SageMaker notebook instance with pre-populated notebook samples for Neptune ML. Don’t forget to delete the stack once you’re done with it.

Raw Data Preparation

As already mentioned, we use the Global Terrorism Database. For this example, we work with a subset of the data taken directly from its website.

Pandas read_html table from Global Terrorism Database website
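Once the raw table is loaded, it has to be turned into nodes and edges. The sketch below shows one way this could be done, assuming each record has a date and a city: attacks are sorted by date and each consecutive pair becomes an ATTACKED_AFTER edge from the earlier city to the later one. The sample records, the dates and the edge-building rule are assumptions for illustration, not necessarily the construction used for this post. The column headers follow Neptune's Gremlin bulk-load CSV format.

```python
import csv
import io

# Hypothetical attack records; dates and target types are invented.
attacks = [
    {"date": "2014-01-05", "city": "Pulka", "target": "Military"},
    {"date": "2014-01-02", "city": "Ala", "target": "Private Citizens & Property"},
    {"date": "2014-01-09", "city": "Konduga", "target": "Police"},
]

def build_graph_rows(records):
    """Return (vertex_rows, edge_rows) in Neptune Gremlin CSV column layout."""
    ordered = sorted(records, key=lambda r: r["date"])  # ISO dates sort as text
    vertices = {}
    for rec in ordered:
        city = rec["city"]
        vertices.setdefault(
            city, {"~id": city, "~label": "City", "name:String": city}
        )
    edges = []
    for earlier, later in zip(ordered, ordered[1:]):
        if earlier["city"] == later["city"]:
            continue  # skip self-loops from repeated attacks on one city
        edges.append({
            "~id": f"e{len(edges) + 1}",
            "~from": earlier["city"],
            "~to": later["city"],
            "~label": "ATTACKED_AFTER",
        })
    return list(vertices.values()), edges

def to_csv(rows):
    """Serialize a non-empty list of dict rows to CSV text with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

vertices, edges = build_graph_rows(attacks)
print(to_csv(vertices))
print(to_csv(edges))
```

The two CSV outputs could then be uploaded to Amazon S3 and loaded into the Neptune cluster with the bulk loader.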

Now that our data sample is ready, it’s time to train our GNN model with Amazon Neptune ML. The figure below shows the steps for Neptune ML to train a GNN-based recommendation system.

Source: Graph-based recommendation system with Neptune ML: An illustration on social network link prediction challenges

Data export configuration

The first step in our Neptune ML process is to export the graph data from the Neptune cluster. We must specify the parameters and model configuration for the data export task. We use the Neptune workbench for all of the configurations and commands. The workbench lets us work with the Neptune DB cluster using Jupyter notebooks hosted by Amazon SageMaker. In addition, it provides a number of magic commands in the notebooks that save a great deal of time and effort. Here is our example of export parameters:

Neptune ML export params
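For readers without access to the screenshot, the following is a sketch of what such export parameters might look like for a link prediction task on the City / ATTACKED_AFTER graph. The overall shape follows the Neptune export request format as I recall it from the AWS documentation, so field names should be checked against the current docs. The endpoint and S3 path are placeholders, and the values are assumptions rather than the exact configuration used here.

```python
# Placeholder values: replace with your own cluster endpoint and bucket.
NEPTUNE_ENDPOINT = "<your-neptune-cluster-endpoint>"
S3_BUCKET_URI = "s3://<your-bucket>"

export_params = {
    "command": "export-pg",  # export a property graph
    "params": {
        "endpoint": NEPTUNE_ENDPOINT,
        "profile": "neptune_ml",
        "cloneCluster": False,
    },
    "outputS3Path": f"{S3_BUCKET_URI}/neptune-export",
    "additionalParams": {
        "neptune_ml": {
            "version": "v2.0",
            "targets": [
                {
                    # (source label, edge label, destination label)
                    "edge": ["City", "ATTACKED_AFTER", "City"],
                    "type": "link_prediction",
                }
            ],
        }
    },
    "jobSize": "small",
}
```

The important part for this use case is the target: it tells Neptune ML that the model should learn to predict ATTACKED_AFTER edges between City nodes.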

See Export data from Neptune for Neptune ML for more information on the required configurations for export_params.

Data preprocessing

Neptune ML performs feature extraction and encoding as part of the data-processing steps. Common types of property pre-processing include: encoding categorical features through one-hot encoding, bucketing numerical features, or using word2vec to encode a string property or other free-form text property values.
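To make two of these encodings concrete, here is a simplified sketch of one-hot encoding a categorical property and bucketing a numerical one. It only illustrates the ideas and is not Neptune ML's implementation; the property names (a target type and a fatalities count) are hypothetical.

```python
def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over known categories."""
    return [1 if value == category else 0 for category in categories]

def bucketize(value, low, high, n_buckets):
    """Map a number to the index of an equal-width bucket in [low, high]."""
    if value <= low:
        return 0
    if value >= high:
        return n_buckets - 1
    width = (high - low) / n_buckets
    return int((value - low) / width)

# Hypothetical node properties for a City.
target_types = ["Military", "Police", "Government"]
print(one_hot("Police", target_types))  # [0, 1, 0]
print(bucketize(25, 0, 100, 4))         # 1
```

Encodings like these turn raw node properties into fixed-length numeric features that a GNN can consume.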

To run data preprocessing, use the following Neptune notebook magic command: %neptune_ml dataprocessing start

See Processing the graph data exported from Neptune for training for more information.

Model training and deployment

The next step is the automated training of the GNN model. The model training is done in two stages. The first stage uses a SageMaker Processing job to generate a model training strategy. This is a configuration set that specifies what type of model and model hyperparameter ranges will be used for the model training. Then, a SageMaker hyperparameter tuning job will be launched.

To start the training step, you can use the %neptune_ml training start command.

Once the training step is completed, we can deploy the GNN model behind a SageMaker endpoint to start serving predictions in real time. The model input is the City for which we want to identify possible follow-up targets after it is attacked, along with the edge type, and the output is the list of likely terrorist targets based on that City.

To deploy the model to the SageMaker endpoint instance, use the %neptune_ml endpoint create command.

See Training a model using Neptune ML for more information.

Query the ML model using Gremlin

Once the endpoint is ready, we can use it for graph inference queries. Neptune ML supports graph inference queries in Gremlin or SPARQL. In our example, we can now ask Neptune ML for the potential terrorist targets linked to the City “Ala”. The query uses nearly the same syntax as an ordinary Gremlin edge traversal, and it lists the other Cities predicted to be connected to Ala through the ATTACKED_AFTER connection.

%%gremlin
g.with("Neptune#ml.endpoint", "${endpoint_name}").
  V().hasLabel('City').has('name', 'Ala').
  out('ATTACKED_AFTER').with("Neptune#ml.prediction").hasLabel('City').
  values('name')

Below is another example that can be used to predict the top eight cities that are most likely to be attacked after an attack on “Ala”:

%%gremlin
g.with("Neptune#ml.endpoint", "${endpoint_name}").
  with("Neptune#ml.limit", 8).
  V().hasLabel('City').has('name', 'Ala').
  out('ATTACKED_AFTER').with("Neptune#ml.prediction").hasLabel('City').
  values('name')

See Gremlin inference queries in Neptune ML for more information.
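If you want to run the same inference query for several cities or different limits, a small helper that assembles the query text can be convenient. The function below is a hypothetical convenience wrapper, not part of any Neptune library; it only builds the Gremlin string shown above and leaves submitting it to whichever client you use.

```python
def top_k_targets_query(endpoint_name, city, k):
    """Build the Gremlin inference query for the k most likely next targets."""
    safe_city = city.replace("'", "\\'")  # escape quotes in the city name
    return (
        f'g.with("Neptune#ml.endpoint", "{endpoint_name}")'
        f'.with("Neptune#ml.limit", {int(k)})'
        f".V().hasLabel('City').has('name', '{safe_city}')"
        ".out('ATTACKED_AFTER').with(\"Neptune#ml.prediction\")"
        ".hasLabel('City').values('name')"
    )

# "my-endpoint" is a placeholder for your SageMaker endpoint name.
query = top_k_targets_query("my-endpoint", "Ala", 8)
print(query)
```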

Conclusion

This post was inspired by the AWS blogpost Graph-based recommendation system with Neptune ML: An illustration on social network link prediction challenges. We demonstrated how Neptune ML and GNNs can help augment intelligence gathering for security agencies in foiling terrorist attacks using link prediction tasks, combining information from complex interaction patterns in the graph.

For a production-ready implementation, the solution would need to be trained on a larger dataset. We may also consider enriching the dataset, specifically the locations (nodes), with other attributes such as weather conditions, distance to specific points of interest, etc.

Want to build your next AI use case on AWS? You can reach me via email, follow me on Twitter or connect with me on LinkedIn and subscribe to my newsletter on Medium.
