How might we gather information from hard-to-access areas to prevent mass violence against civilians?
CrisisTracker: Real-time Social Media Curation
=== OPEN QUESTIONS ===
The proposed open-source system (see below) has been used in practice for conflict monitoring, but several research, design and development challenges remain. If you want to contribute, please post your thoughts below regarding the following questions.
I (Jakob) would also love to brainstorm about these challenges in a Skype call (jakob.rogstadius), in particular if you have decision making or analyst experience related to conflict monitoring or intervention.
What information actually leads to action in the domain of conflict monitoring and prevention?
Neither technology nor information automatically leads to action and there are several cases where atrocities during civil wars have been well known, but no action was taken to prevent them (e.g. Rwanda). I can also imagine cases where a summary of raw reports with limited context or explanation can actually trigger new violence. What information should a system like this provide to be meaningful and to lead to positive change?
Can open-access information management systems improve the safety of regular citizens or help them contribute to peace?
Conflict monitoring and atrocity prevention are traditionally approached from a top-down perspective. The role of information management systems targeted at expert analysts is well established, but can such systems also be used to empower bottom-up efforts? Rather than discussing what information should be hidden from the public, is there any information that can help improve individual safety, or promote mindsets that lead to conflict reduction and long-term stability in conflict zones?
What decision making processes should be supported?
I am primarily a software engineer and I need to know more about the specific decisions that are made by decision makers in peacekeeping and conflict monitoring situations. What decisions need to be taken, when, and what information is required to make those decisions? This knowledge would be extremely helpful for making design trade-offs and prioritizing features in the system.
How can a crowd assist decision makers with meaningful analysis?
Evaluation of the system has shown that both volunteers who curate content and decision makers who wish to consume the information would prefer that volunteers work with more complex reasoning tasks, rather than just data annotation. What decisions are frequent and important enough that it would be efficient to offload the required data analysis to a skilled or semi-skilled crowd? How can volunteers sufficiently share their evidence, reasoning and conclusions with decision makers for their work to be trusted?
What quantitative indicators are needed?
Good quantitative indicators (things that can be measured numerically) are required to provide meaningful time series, and to rank content by 'importance'. However, when the raw data is social media content, it is very difficult to extract traditional quantitative indicators such as the number of affected unemployed women in rural areas. Other quantitative metrics such as the number of messages or the number of unique people discussing the event are readily available, but are far less meaningful. It's clear that some form of quantifier needs to be extracted, but what low-hanging fruit should we aim for to still provide helpful time series? A rough estimate of the number of people affected (1s, 10s, 100s, 1000s) per event? Number of new events of type X per day? Number of people discussing any event of type X per day?
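To make two of the candidate indicators above concrete, here is a minimal sketch of how they could be computed once events have been labeled. It is only an illustration: the record layout (`day`, `type`), the function names, and the sample records are assumptions invented for this example, not part of CrisisTracker.

```python
from collections import Counter
from datetime import date


def magnitude_bucket(people_affected: int) -> str:
    """Map a rough head count to an order-of-magnitude label (1s, 10s, 100s, 1000s).

    Anything of a thousand or more is capped at "1000s", since the estimate
    is only meant to be coarse.
    """
    if people_affected < 1:
        return "0"
    exponent = min(len(str(people_affected)) - 1, 3)
    return f"{10 ** exponent}s"


def events_per_day(events, event_type):
    """Count new events of one type per day, returned in chronological order."""
    counts = Counter(e["day"] for e in events if e["type"] == event_type)
    return dict(sorted(counts.items()))


# Invented sample records, for illustration only.
sample_events = [
    {"day": date(2012, 10, 1), "type": "shelling"},
    {"day": date(2012, 10, 1), "type": "shelling"},
    {"day": date(2012, 10, 1), "type": "bombing"},
    {"day": date(2012, 10, 2), "type": "shelling"},
]

if __name__ == "__main__":
    print(events_per_day(sample_events, "shelling"))
    print(magnitude_bucket(42))
```

Both indicators depend on events already being typed and roughly sized, which is exactly the part that is hard to extract from raw social media content.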
What are the ethical implications of a system like this, in particular in conflict situations?
I believe sources are sufficiently protected, but do others agree with me? What if the system collects information that mostly benefits one side in the conflict? Are there any (new) risks that this tool introduces into decision making processes, or does this tool simply require the same skepticism as any other source?
=== CONCEPT DESCRIPTION - CRISIS TRACKER ===
During conflicts in recent years, online social media (mainly Twitter, Facebook and YouTube) has emerged as a means for conflict-affected local populations to communicate their experiences to the world. With increasing technology adoption and free access to posted messages, online social media can now be used to leverage the reporting capacity of thousands or millions of people on the ground for large-scale real-time distributed sensing.
The Twitter microblogging service saw 500 million tweets being posted daily in October 2012, by over 200 million active users. Unlike, for instance, Facebook posts and SMS, the vast majority of these tweets are shared publicly and can be accessed in real time through an application programming interface (API). The challenge, however, is sense-making. With so much content being generated, maintaining overview and history, and detecting patterns and actionable information, requires specialized information management tools.
CrisisTracker is an open-source online web platform developed primarily by me during my PhD studies, which adds structure to millions of reports already available on Twitter. This additional layer of structure helps reduce information overload, making it much easier to use social media as a rich source for real-time situational awareness.
CrisisTracker infers structure by making use of the repetition that occurs when multiple people independently report impactful events, in two ways. First, the greater the number of people that talk about an event, the more likely that event is to be of interest to a system user. This is not a perfect indicator, but with far more information being collected than what can be consumed, having such a metric is critical. Second, the CrisisTracker platform uses an automated real-time clustering algorithm to group together tweets that are textually very similar. A cluster of messages (a “story”) typically refers to a single well-defined event, such as an attack on a protected object, artillery shelling of a location, a bombing, etc. Although individual tweets are both extremely brief (up to 140 characters) and difficult to verify independently, stories in CrisisTracker capture the event from multiple viewpoints and provide a real-time index of published evidence in the form of images, video and news articles.
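The two ideas in this paragraph, grouping near-identical tweets and ranking stories by how many people report them, can be illustrated with a small sketch. This is not CrisisTracker's actual algorithm: it assumes a simple greedy pass with Jaccard similarity over word sets, and the names `cluster_stream`, `Story` and the 0.5 threshold are hypothetical choices made for the example.

```python
import re
from dataclasses import dataclass, field


def tokenize(text: str) -> frozenset:
    """Lowercase a tweet and return its set of word tokens (hashtags and mentions kept)."""
    return frozenset(re.findall(r"[#@]?\w+", text.lower()))


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity: number of shared tokens divided by number of distinct tokens."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class Story:
    tokens: frozenset  # tokens of the first tweet, used as the story's signature
    tweets: list = field(default_factory=list)
    authors: set = field(default_factory=set)

    @property
    def size(self) -> int:
        """Number of distinct people reporting the story (the ranking signal)."""
        return len(self.authors)


def cluster_stream(tweets, threshold=0.5):
    """Greedily assign each (author, text) pair to the most similar existing story.

    A new story is started when no existing story reaches the similarity
    threshold. Stories are returned sorted by number of distinct authors.
    """
    stories = []
    for author, text in tweets:
        tokens = tokenize(text)
        best, best_sim = None, 0.0
        for story in stories:
            sim = jaccard(tokens, story.tokens)
            if sim > best_sim:
                best, best_sim = story, sim
        if best is None or best_sim < threshold:
            best = Story(tokens=tokens)
            stories.append(best)
        best.tweets.append(text)
        best.authors.add(author)
    return sorted(stories, key=lambda s: s.size, reverse=True)
```

Comparing every tweet against every story is far too slow at the volumes described above; a real-time system would need an approximate index for the similarity lookup. The sketch only shows the grouping and ranking logic.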
After reports have been clustered, the platform uses crowdsourcing techniques to extract structured meta-data (type of event, geographic location and named entities) from the stories, which improves the quality of search and filtering in the system.
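As a rough illustration of why this meta-data helps, the sketch below filters stories by the three kinds of tags mentioned above. The `TaggedStory` structure and the `filter_stories` function are hypothetical and only assume that each story carries an event type, a location name and a set of named entities; the real data model may differ.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedStory:
    summary: str
    event_type: str = ""  # e.g. "shelling", assigned by volunteers
    location: str = ""    # place name; real coordinates are left out of this sketch
    entities: set = field(default_factory=set)  # named entities mentioned in the story


def filter_stories(stories, event_type=None, location=None, entity=None):
    """Return the stories matching every criterion that is given (None means 'any')."""
    result = []
    for story in stories:
        if event_type is not None and story.event_type != event_type:
            continue
        if location is not None and story.location != location:
            continue
        if entity is not None and entity not in story.entities:
            continue
        result.append(story)
    return result
```

Without the tags, the same query would have to rely on keyword matching against very short and noisy tweet text.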