Posts Tagged ‘behavior’

Wiki Surveys for Social Science Research

Friday, June 10th, 2011

Surveys and interviews form the central methodology for discovering and analyzing attitudes and opinions in social science research. With the advent of the Web, online surveys have become an efficient way for researchers to collect and analyze large amounts of data. The popularity of online survey tools like SurveyMonkey, Zoomerang, and SurveyGizmo is testament to the productivity enabled by surveys. However, surveys represent a rigid, top-down methodology that forces the survey designer to account for all possible answers up front, which is an impossible feat. In contrast, interviews allow unanticipated information to bubble up from the respondents. For instance, the primary data for Afghan Perceptions and Experiences with Corruption: A National Survey 2010, by Integrity Watch Afghanistan (IWA), came from interviewing 6,500 randomly selected respondents in 32 provinces on over 100 questions that deal with sectors where people experienced corruption; levels of bribes people paid to obtain services; what type of access people had to essential services; whom people trusted to combat corruption; and experiences with corruption in the judiciary, police, and land management. However, the interview methodology is expensive and time-consuming, as it requires implementation by research companies with expertise in effective research design and precise management of data collection over several months.

Is there an alternative to surveys and interviews in social science research? Prof. Salganik’s team at Princeton came up with a hybrid approach, “wiki surveys”, that combines the structure of a survey with the open-endedness of an interview. To date, various organizations have created more than 1,000 wiki surveys on the project Web site, All Our Ideas, generating 45,000 ideas and 2 million votes. Wiki surveys range from the New York City Mayor’s Office engaging citizens in shaping the city’s long-term sustainability plan to Catholic Relief Services surveying its 4,000 employees to find out what makes an ideal relief worker. The figure below shows how the third question in the Tactical Conflict Assessment Planning Framework (TCAPF) would be implemented as a wiki survey:

tcapf wiki survey.jpg

Inspired by extending the kittenwar concept to ideas, the user interface guides the respondent to choose between two random alternatives, while encouraging the respondents to add their ideas into the mix of alternative responses. The additional ideas are added into the survey’s marketplace and voted up or down by the other survey-takers. Prof. Salganik says that “One of the patterns we see consistently is that ideas that are uploaded by users sometimes score better than the best ideas that started it off. Because no matter how hard you try, there are just ideas out there that you don’t know.”
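The mechanics described above (show two random alternatives, record a vote, let respondents upload new ideas) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the All Our Ideas implementation: the class name `WikiSurvey`, its methods, the placeholder ideas, and the smoothed win-rate score are all hypothetical choices made for the example.

```python
import random
from collections import defaultdict


class WikiSurvey:
    """Toy pairwise wiki survey (hypothetical names, not the All Our Ideas API)."""

    def __init__(self, question, seed_ideas, rng=None):
        self.question = question
        self.ideas = list(seed_ideas)   # alternatives the survey starts with
        self.uploaded = set()           # ideas added later by respondents
        self.wins = defaultdict(int)
        self.losses = defaultdict(int)
        self.rng = rng or random.Random()

    def next_pair(self):
        """Pick two distinct ideas at random to show the respondent."""
        return tuple(self.rng.sample(self.ideas, 2))

    def vote(self, winner, loser):
        """Record that the respondent preferred `winner` over `loser`."""
        self.wins[winner] += 1
        self.losses[loser] += 1

    def add_idea(self, text):
        """Let a respondent upload a new alternative into the mix."""
        if text not in self.ideas:
            self.ideas.append(text)
            self.uploaded.add(text)

    def score(self, idea):
        """Smoothed win rate (an assumption; the real scoring may differ)."""
        w, l = self.wins[idea], self.losses[idea]
        return (w + 1) / (w + l + 2)

    def ranking(self):
        """Ideas sorted from highest to lowest score."""
        return sorted(self.ideas, key=self.score, reverse=True)


# Usage with placeholder ideas: an uploaded idea can outrank the seed ideas.
survey = WikiSurvey("Placeholder question?", ["idea A", "idea B"],
                    rng=random.Random(0))
survey.add_idea("idea C")
survey.vote("idea C", "idea A")
survey.vote("idea C", "idea B")
survey.vote("idea A", "idea B")
top = survey.ranking()
```

In this toy run the uploaded "idea C" wins both of its match-ups and ends up ranked first, which is the pattern Prof. Salganik describes.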

All Our Ideas has some basic visualization features to make sense of the wiki survey responses. Here is the visualization of the responses to the question “What do you think the Digital Public Library of America (DPLA) should be like?”:

DPLA Survey Reponse.jpg

It is worth noting that the 15 top-scoring ideas, starting with DPLA interoperability with the Government Printing Office (GPO), the Defense Technical Information Center (DTIC), and the National Archives and Records Administration (NARA), are all uploaded ideas that were not in the original set of alternatives. A powerful argument for crowdsourcing!

Admittedly, we still need boots on the ground to collect TCAPF data in Afghanistan given the demographics of the people we want to reach. On the other hand, wiki surveys hold great potential for reaching the younger generation fueling the Arab Spring and the like.

Bin Laden Hideout vs. Al Qaeda Training Manual

Sunday, May 8th, 2011

In our Building Intent project, we developed a geoprofiling algorithm that predicts the location of facilities that support adversary operations in the urban environment. Geoprofiling is a technique that is widely used in serial crime investigations. In our project, we researched and developed a building intent inference system based on terrorist preferences, building characteristics, and social network behavior. Our approach learns the utility function that the adversaries are using, and classifies and predicts the potential utility of a facility to the adversaries based on the derived metadata of each facility using influence networks.

For terrorist preferences, we studied Military Studies in the Jihad Against the Tyrants: The Al-Qaeda Training Manual in order to find the building use tactics in which the adversary trains its recruits, and found a significant number of building use related tactics and procedures embodied in these manuals. In collaboration with the Terrorism Research Center in Fulbright College, University of Arkansas, we then studied the international terrorism cases in the American Terrorism Study, and found empirical evidence in the observed data that the terrorism manual tactics are practiced. Based on these findings, we developed a baseline set of indicators for modeling building intent, and researched the likely causal connections among these variables. We then built extractors to derive a set of metadata for these indicators, and used machine learning algorithms to find the causal connections between the incidents or events and the building attributes, to estimate model parameters, and to build classifiers based on terrorist process preferences, building characteristics, and guilt by association data.

As shown in the figure below, our geoprofiling algorithm does a nice job of predicting the Japanese Red Army terrorist Yu Kikumura’s residence in New York based on the American Terrorism Study. Here the blue markers signify police stations and the white arrows signify the egress points. As shown in the figure, Yu Kikumura’s residence at 327 East 34th Street, NY is in the red hotspot area predicted by our algorithm. Avoiding police stations and ease of egress were two of the primary factors in Kikumura’s choice of housing. Not only is his apartment equidistant from the nearest police stations – all of which are over one kilometer away – its back-alley access road to the underground Queens-Midtown Tunnel provides a quick getaway by car. In addition, an examination of the residence floor plan reveals that the apartment building had numerous staircases (one of which is private to the unit) to the basement level with a rear exit.

Japanese Red Army.png
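To make the two factors named above concrete, here is a deliberately simplified sketch of a location score that rewards distance from police stations and proximity to egress points. It is not our geoprofiling algorithm, which learns its utility function from data using influence networks; the function names, the hand-set weights, and the coordinates are all hypothetical and chosen only for illustration.

```python
import math


def nearest(point, places):
    """Distance from `point` to the closest of `places` (planar coordinates)."""
    return min(math.dist(point, p) for p in places)


def utility(cell, police, egress, w_police=1.0, w_egress=1.0):
    """Hand-weighted score (an assumption): far from police, near an egress point."""
    return w_police * nearest(cell, police) - w_egress * nearest(cell, egress)


def hotspots(cells, police, egress, top_n=1):
    """Return the `top_n` candidate cells with the highest utility."""
    ranked = sorted(cells, key=lambda c: utility(c, police, egress), reverse=True)
    return ranked[:top_n]


# Made-up grid coordinates, not real map data.
police_stations = [(0, 0)]
egress_points = [(4, 4)]
candidate_cells = [(0, 1), (2, 2), (4, 3)]
best = hotspots(candidate_cells, police_stations, egress_points)
```

A learned model would replace the fixed weights with parameters estimated from case data, and would draw on many more indicators than these two.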

The Al Qaeda Training Manual gives several instructions for renting a residence, as shown in the table below. For instance, it is preferable to rent apartments on the first floor for ease of egress, to avoid apartments near police stations and government buildings as well as those in isolated or deserted locations, to rent in newly developed areas, and the like. In particular, the Al Qaeda Training Manual calls for the use of the following tactics in renting an apartment:

Al Qaeda Tactics.jpg

So how does the location of Bin Laden’s secret hideout in Abbottabad follow the advice of the Al Qaeda Training Manual? Not that closely. Bin Laden clearly ignored most of the tactics: he lived on the third floor rather than the ground floor; he selected a location near the Pakistan Military Academy rather than avoiding police stations and government buildings; he chose a neighborhood of retired Army generals rather than a newly developed area where people do not know each other; and he built no exit stairs, despite the advice to prepare ways of vacating the premises in case of a surprise attack. The only tactic from the list above that Bin Laden used is avoiding an isolated location. One wonders if Bin Laden made a concerted effort to avoid his own tactical advice in order to thwart geoprofiling techniques. Perhaps another consideration that will need to be taken into account in future geoprofiling is assistance from outside forces, given the possible connection to a support network that included elements of the Pakistani military or intelligence services in the Abbottabad area.

Increased political violence in store for Italy and Czech Republic?

Monday, November 29th, 2010

In collaboration with our academic partners Prof. Cingranelli at the Political Science Department, SUNY Binghamton University and Profs. Sam Bell and Amanda Murdie at the Department of Political Science, Kansas State University, we developed a Domestic Political Violence Model that forecasts political violence levels five years into the future. The model enables policymakers, particularly in the COCOMs, to proactively plan for instances of increased domestic political violence, with implications for resource allocation and intelligence asset assignment. Our model uses the IDEA dataset for political event coding, plus numerous indicators from the CIRI Human Rights Dataset, Polity IV Dataset, World Bank, OECD, Correlates of War project, and Fearon and Laitin datasets. Here is our model’s forecast for 2010 – 2014 as a ranked list:

  1. Iran
  2. Sri Lanka
  3. Russia
  4. Georgia
  5. Israel
  6. Turkey
  7. Burundi
  8. Chad
  9. Honduras
  10. Czech Republic
  11. China
  12. Italy
  13. Colombia
  14. Ukraine
  15. Indonesia
  16. Malaysia
  17. Jordan
  18. Mexico
  19. Kenya
  20. South Africa
  21. Ireland
  22. Peru
  23. Chile
  24. Armenia
  25. Tunisia
  26. Democratic Republic of the Congo
  27. Belarus
  28. Argentina
  29. Albania
  30. Ecuador
  31. Sudan
  32. Austria
  33. Nigeria
  34. Syria
  35. Kyrgyz Republic
  36. Egypt
  37. Belgium

Our model applies regression to a large number of drivers-of-conflict variables spanning numerous open source social science datasets, and uses a novel Negative Residuals technique. Negative residuals result when the model predicts higher levels of violence than were actually experienced; they indicate nation states that are predisposed to increasing levels of violence, based on the presence of environmental conditions and drivers of conflict with a demonstrated correlation with measured political violence. The residuals imply that these are states in which we expect to observe increases in violence, although not necessarily high levels of violence. So Iran and Sri Lanka are not expected to have the same level of violence, but are expected to have the same magnitude of increase in violence.
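The idea of ranking by negative residuals can be illustrated with a toy example. The sketch below fits an ordinary least squares line with a single made-up driver and flags the cases where observed violence falls below the prediction. It is only an illustration of the residual logic: the real model uses many indicators from the datasets listed above, and the function names, the country labels A through E, and all numbers here are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def negative_residuals(names, drivers, observed):
    """Cases where observed < predicted, most negative residual first."""
    intercept, slope = fit_line(drivers, observed)
    residuals = [
        (name, y - (intercept + slope * x))
        for name, x, y in zip(names, drivers, observed)
    ]
    flagged = [(name, r) for name, r in residuals if r < 0]
    return sorted(flagged, key=lambda item: item[1])


# Made-up data: one conflict driver and an observed violence level per country.
names = ["A", "B", "C", "D", "E"]
driver = [1, 2, 3, 4, 5]
violence = [2, 1, 1, 5, 6]
watch_list = negative_residuals(names, driver, violence)
```

Here countries C and B are flagged, in that order: given their driver values, the fitted line predicts more violence than they have experienced so far, which is the condition the post describes as predisposition to an increase.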

There are some unexpected countries on our list, like the Czech Republic and Italy. Time will tell how accurate our model’s predictions are, although the recent political violence in Ecuador is an early indicator of the model’s effective performance. The model uses nuanced measures of repression and captures variables that can be manipulated by policy makers. Our project page has further details on the model.

The Age of Assistants?

Wednesday, August 25th, 2010

Reading Norman Winarsky’s post on The Age of Assistants reminded me of the scene from the movie Dances with Wolves where the Sioux Chief asks Lieutenant Dunbar, played by Kevin Costner:

You always ask about the white people. You always want to know how many more are coming. There will be a lot, my friend. More than can be counted.

How many?

Like the stars.

In a similar vein, Winarsky says: “And we likely won’t have just one assistant – we’ll have two or three or maybe even 10, a scalable, distributed cadre, an army, even – of Virtual Personal Assistants (VPAs) at our service.” I believed that in 1993 when we shipped the world’s first desktop assistant Open Sesame! for the Mac. Open Sesame! was a learning agent that observed a user’s interaction with the operating system GUI, found repetitive patterns and user preferences, and offered to automate repetitive tasks for the user:

Open Sesame Box.jpg Open Sesame Dialog Box.jpg

Open Sesame! was a relative success on the Mac. It was localized in Japan, shipped with every PowerPC in Taiwan, and got positive reviews in the US, Italy, Germany, and beyond. In A Review and Analysis of Commercial User Modeling Servers for Personalization on the World Wide Web, Fink and Kobsa state: “Open Sesame can be considered an early pioneer of personalization, both in research and commercial environments. Despite its early market entry and its sophisticated features, there is, to the best of our knowledge, no commercial system on the market that is comparable to Learn Sesame.” In spite of the positive reviews, Open Sesame! could not escape the criticism of generating a new category of software – nagware. Alas, Open Sesame! cannot even get credit for generating this category in Winarsky’s article, which bestows this honor on Microsoft’s brain-dead Clippy.

We presented our analysis of the rich database of user feedback collected with Open Sesame! in our Applied Artificial Intelligence paper Learn Sesame – a Learning Agent Engine. While the users found event-based learning useful, they found the monitored events and offered actions limited in scope, and stated their desire for improved agent communication and social skills. In the intervening 15 years, a lot has changed in personal computing to make the conditions ripe for software assistants:

  • Most personal computer users have embraced direct manipulation in user experience. The idea of delegating a task to a software assistant and then waiting a couple of seconds, minutes, or hours will only work for boring, periodic maintenance tasks. Both personal and server computers have now become powerful enough to offer instant responses to delegated tasks. That will change everything, in that tasks delegated to software agents will feel like direct manipulation to the user, thus increasing adoption.
  • Thanks to mobile computing, users have become used to notifications on their smartphones. Agent notifications that seem like intrusive spam on the desktop are now welcomed by users on their phones. In other words, we are more open to intrusion on our smartphones, as they entertain us while we wait out a 3-hour flight delay. So the mobile computing platforms will be more welcoming to software agents that notify their users about delegated tasks with status updates, requests for additional task clarification, and the like.
  • Social computing helped users embrace notifications of changes in their social networks. The new generation of users cannot go a couple of seconds without clicking on a Facebook notification in their mobile app, which is essentially a notification agent for Facebook. In other words, Facebook is teaching users the value of notifications, which were considered nagware intrusion on the desktop.
  • There is a decent amount of content for personalization. In 1998, using the Open Sesame learning engine, we built eGenie – a personalized Web site that learned user interests, built user profiles, and presented personalized content for new books, movies, TV shows, concerts, etc. Frankly, how personalized can a movie preference be when you share it with millions of others? Not much. In contrast, social media is now generating truly personal content like your friends’ Facebook updates, Delicious annotations … Personalization that can be performed by software assistants has a lot more value for the Long Tail.
  • Semantic technology is coming on strong. As your friends and colleagues generate more semantically tagged content using tags, forms, etc., it will make the job of personal assistants easier in filtering knowledge of import to users. Similarly, web services APIs, linked data, etc. are becoming mainstream, thus making it easy for your personal assistant to interact programmatically with these data and services in the cloud on your behalf.
  • When I first showed Open Sesame! to Don Norman at Apple, he asked: “What enabled this product to be built? Why now?” I replied: Apple Events enabled us to monitor user actions reliably, and instruct the OS to perform tasks with ease. My answer had some element of truth, as trying to build the same assistant for Windows 95 proved to be an insurmountable task due to the lack of support for high-level recordable and scriptable events on that platform. Now that the Web browser is becoming the GUI of the operating system as we move more towards cloud computing, it is relieving personal assistants from the necessity of learning the legacy of desktop operating systems and putting up with their changes.

At milcord, we are keeping the learning agent flame burning in our Commander’s Learning Agent project. Stay tuned.

Tribal Human Terrain of Afghanistan

Thursday, July 8th, 2010

Under the sponsorship of the OSD Human Social Culture Behavior (HSCB) program, we are developing a semantic wiki for Complex Operations. The envisioned operational impact of our effort is to foster collaboration and sharing of knowledge for a whole-of-government approach, and to improve COIN/SSTR operations analysis and execution by focusing on the population as the center of gravity. The development of such a wiki presents several challenges, including the broad domain of knowledge that complex operations require, a large number of doctrine publications to wikify and semantify, several out-of-print key references, etc. With these challenges, we saw an opportunity to develop an open source culturepedia for the Afghan and Pakistani human terrain, as such knowledge is not aggregated and not readily available.

The Complex Operations wiki currently contains more than 1,000 articles on the various tribal dynamics and locational knowledge for the Afghanistan and Pakistan region, outlining tribal meta-knowledge such as the sub-groups, primary locations, traditional alliances, and traditional disputes of various groups to support situational awareness about the human terrain. Here is the wiki page for the covered Afghanistan Organizational Groups. We have created over 150 concept maps (an example is shown below) to capture the knowledge about 1,000 ethnic groups, tribes, sub-tribes, and clans within the Afghanistan and Pakistan region, making this human terrain knowledge readily accessible to the complex operations practitioner.

tribal concept map.png

Tribal Tree in Afghanistan

Our use of a semantic wiki platform enables the representation of the human terrain knowledge as facts and relationships. For instance, the wiki page for the Achakzai tribal group lists the known facts and relationships about this ethnic group both in a human-consumable form using semantic forms:

Achakzai Semantic Form.tiff

and in a machine-consumable form as semantic RDF relationships:

Achakzai RDF.tiff

Factbox

By inspecting the semantic form, the reader can deduce that Achakzai is a sub-tribe of Zirak, which is a sub-tribe of the Durrani super-tribe, that it is primarily located in the Chora and Khas Uruzgan districts, and that it traditionally has disputes with the Nurzai, Panjpai and Kakar tribes. The representation of this knowledge in a semantic wiki has the additional advantage of supporting faceted browsing and answer-engine queries. For instance, the semantic wiki can answer questions like “What are the tribes in Kandahar Province and their traditional disputes?” as a table which gets automatically updated every time a new tribe in this province is added to the wiki:

Tribes in Kandahar.tiff

There are also several groups in Afghanistan that do not organize around tribal kinship ties, including Uzbeks, Tajiks, and Hazaras. In addition to tribal affiliation, social organizations such as solidarity groups – groups of people that act as a single unit and organize on the basis of some shared identity – and patronage networks – led by a local warlord or khan – play an important role in understanding the human terrain. The Afghan and Pakistani human terrain and situational awareness knowledge base can be extended to include other populations of interest to the community, such as Yemen or Somalia.
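The kind of query the semantic wiki answers can be sketched with plain subject-predicate-object triples. The example below uses only the Achakzai facts mentioned above, reading the locations and disputes as applying to the Achakzai. The predicate names (`sub_tribe_of`, `located_in`, `disputes_with`) and the helper functions are hypothetical stand-ins, not the wiki's actual property names or query language.

```python
# Facts from the Achakzai example, stored as (subject, predicate, object).
TRIPLES = [
    ("Achakzai", "sub_tribe_of", "Zirak"),
    ("Zirak", "sub_tribe_of", "Durrani"),
    ("Achakzai", "located_in", "Chora"),
    ("Achakzai", "located_in", "Khas Uruzgan"),
    ("Achakzai", "disputes_with", "Nurzai"),
    ("Achakzai", "disputes_with", "Panjpai"),
    ("Achakzai", "disputes_with", "Kakar"),
]


def objects(triples, subject, predicate):
    """All objects o such that (subject, predicate, o) is a stored fact."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def lineage(triples, tribe):
    """Follow sub_tribe_of links upward, e.g. sub-tribe -> tribe -> super-tribe."""
    chain, seen = [], {tribe}
    parents = objects(triples, tribe, "sub_tribe_of")
    while parents and parents[0] not in seen:
        chain.append(parents[0])
        seen.add(parents[0])
        parents = objects(triples, parents[0], "sub_tribe_of")
    return chain


def tribes_and_disputes(triples, location):
    """Tribes located in `location`, each with its traditional disputes."""
    tribes = sorted({s for s, p, o in triples
                     if p == "located_in" and o == location})
    return {t: objects(triples, t, "disputes_with") for t in tribes}


ancestry = lineage(TRIPLES, "Achakzai")
chora_table = tribes_and_disputes(TRIPLES, "Chora")
```

Because the table is recomputed from the triples on every call, adding a new `located_in` fact changes the result the next time the query runs, which mirrors the auto-updating behavior of the wiki table.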


FOCUS 2010 Human Social Cultural Behavior Modeling Conference

Sunday, August 9th, 2009

Last week we attended the OSD FOCUS 2010 Human Social Cultural Behavior (HSCB) Modeling Program Conference in Chantilly, VA. It was an oversubscribed event with more than 400 attendees, reflecting the broad interest in this program. Plenary presentations stressed the need to focus on the needs of the warfighter as opposed to the needs of the research community. The importance of data, especially at the province level, was stressed over and over as a true need. Several speakers urged the adoption of common metadata for social science data sets. Another related modeling issue addressed was the difference between correlation (coincidence) and causal relationships.

Joe Watts of AGC lamented the lack of GIS support in the HSCB projects, and urged the development of HSCB map symbology for transition to the warfighter. Dr. Lisa Costa described the use of POET (Political, Operational, Economic and Technical) relationships in social network analysis, and the importance of developing a social radar for HSCB analysis. In his concluding remarks, Dr. Robert Foster challenged everyone to think about solutions for professional training in the HSCB domain.

We participated in a poster exhibition and gave three presentations: first, a general overview of our Semantic Wiki for Complex Operations project; second, the knowledge management needs of the complex operations community; and third, an open source social science data repository for HSCB research. Our presentations were well received and generated several questions. In particular, a number of attendees noted the suitability of our wiki for supporting the training of complex operations professionals. There was also interest in our social data repository supporting legacy data and automating the ingestion of new data.

There were too many interesting presentations to cover every one. Dr. Sean O’Brien gave an overview of his Conflict Modeling, Planning, and Outcomes Experimentation (COMPOEX) program. The precision and recall of the lead contractor Lockheed Martin’s system, which is based on AdaBoost boosting of multiple social behavior models, were fairly impressive. Dr. Barry Silverman’s agent-based model CountrySim, one of the models in the Lockheed Martin system, contributed to the high performance. In terms of data collection, Prof. Mansoor Moaddel’s values and attitudes survey for Afghanistan, Iraq, Iran, Saudi Arabia, Jordan and Egypt was particularly interesting.

CDR Dylan Schmorrow put the state of the art in human social culture behavior models into perspective by comparing HSCB models to models for weather and economic forecasting. Weather forecasting models are more mature than economic forecasting models. On this scale, CDR Schmorrow positioned HSCB modeling on the x-axis, while hoping that the HSCB program will serve as a catalyst that sets HSCB modeling on a steady march towards maturity.