Creating Dynamic Social Network Models from Sensor Data : Creating Dynamic Social Network Models from Sensor Data Tanzeem Choudhury Intel Research / Affiliate Faculty CSE
Dieter Fox
Henry Kautz CSE
James Kitts Sociology
Slide2 : What are we doing?
Why are we doing it?
How are we doing it?
Social Network Analysis : Social Network Analysis Work across the social & physical sciences is increasingly studying the structure of human interaction
1967 – Stanley Milgram – 6 degrees of separation
1973 – Mark Granovetter – strength of weak ties
1977 – International Network for Social Network Analysis
1992 – Ronald Burt – structural holes: the social structure of competition
1998 – Watts & Strogatz – small world graphs
Social Networks : Social Networks Social networks are naturally represented and analyzed as graphs
Example Network Properties : Example Network Properties Degree of a node
Eigenvector centrality
global importance of a node
Average clustering coefficient
degree to which graph decomposes into cliques
Structural holes
opportunities for gain by bridging disconnected subgraphs
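As a minimal sketch of three of the properties above (degree, eigenvector centrality, average clustering coefficient), the following plain-Python code works on a small toy graph. The graph, the function names, and the use of power iteration are illustrative assumptions, not data or methods from this project.

```python
from math import sqrt

# Toy undirected graph as an adjacency dict (hypothetical example data).
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

def degree(graph, node):
    """Number of neighbours of a node."""
    return len(graph[node])

def eigenvector_centrality(graph, iterations=100):
    """Power iteration on (A + I).

    Adding the identity keeps the same leading eigenvector as A but
    avoids oscillation on bipartite graphs.
    """
    score = {n: 1.0 for n in graph}
    for _ in range(iterations):
        new = {n: score[n] + sum(score[m] for m in graph[n]) for n in graph}
        norm = sqrt(sum(v * v for v in new.values()))
        score = {n: v / norm for n, v in new.items()}
    return score

def clustering(graph, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = list(graph[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in graph[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))

def average_clustering(graph):
    """Mean of the per-node clustering coefficients."""
    return sum(clustering(graph, n) for n in graph) / len(graph)
```

In the toy graph, node "c" bridges "d" to the triangle a-b-c, so it has the highest degree and the highest centrality score.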
Applications : Applications Many practical applications
Business – discovering organizational bottlenecks
Health – modeling spread of communicable diseases
Architecture & urban planning – designing spaces that support human interaction
Education – understanding impact of peer group on educational advancement
Much recent theory on finding random graph models that fit empirical data
The Data Problem : The Data Problem Traditionally data comes from manual surveys of people’s recollections
Very hard to gather
Questionable accuracy
Few published data sets
Almost no longitudinal (dynamic) data
1990s – social network studies based on electronic communication
Social Network Analysis of Email : Social Network Analysis of Email Science, 6 Jan 2006
Limits of E-Data : Limits of E-Data Email data is cheap and accurate, but misses
Face-to-face speech – the vast majority of human interaction, especially complex communication
The physical context of communication – so email is useless for studying the relationship between environment and interaction
Can we gather data on face to face communication automatically?
Research Goal : Research Goal Demonstrate that we can…
Model social network dynamics by gathering large amounts of rich face-to-face interaction data automatically
using wearable sensors
combined with statistical machine learning techniques
Find simple and robust measures derived from sensor data
that are indicative of people’s roles and relationships
that capture the connections between physical environment and network dynamics
Questions we want to investigate: : Questions we want to investigate: Changes in social networks over time:
How do interaction patterns dynamically relate to structural position in the network?
Why do people sharing relationships tend to be similar?
Can one predict formation or break-up of communities?
Effect of location on social networks
What are the spatio-temporal distributions of interactions?
How do locations serve as hubs and bridges?
Can we predict the popularity of a particular location?
Support : Support Human and Social Dynamics – one of five new priority areas for NSF
$800K award to UW / Intel / Georgia Tech team
Intel participates at no cost
Intel Research donating hardware and internships
Leveraging work on sensors & localization from other NSF & DARPA projects
Procedure : Procedure Test group
32 first-year incoming CSE graduate students
Units worn 5 working days each month
Collect data over one year
Units record
Wi-Fi signal strength, to determine location
Audio features adequate to determine when conversation is occurring
Subjects answer short monthly survey
Selective ground truth on number of interactions
Research interests
All data stored securely
Indexed by code number assigned to each subject
Privacy : Privacy UW Human Subjects Division approved procedures after 6 months of review and revisions
Major concern was privacy, addressed by
Procedure for recording audio features without recording conversational content
Procedures for handling data afterwards
Data Collection : Data Collection [Diagram: Intel Multi-Modal Sensor Board → real-time audio feature extraction → audio features and WiFi strength → coded database, indexed by code identifier]
Data Collection : Data Collection Multi-sensor board sends sensor data stream to iPAQ
iPAQ computes audio features and WiFi node identifiers and signal strength
iPAQ writes audio and WiFi features to SD card
Each day, each subject uploads data to the coded database using his or her code number
Older Procedure : Older Procedure Because the real-time feature extraction software was not finished in time, the Autumn 2005 data collections used a different process (also approved)
Raw data was encrypted on the SD card
The upload program simultaneously decrypted the raw data and extracted features
Only the features were uploaded
Speech Detection : Speech Detection From the audio signal, we want to extract features that can be used to determine
Speech segments
Number of different participants (but not identity of participants)
Turn-taking style
Rate of conversation (fast versus slow speech)
But the features must not allow the audio to be reconstructed!
Speech Production : Speech Production Fundamental frequency (F0/pitch) and formant frequencies (F1, F2 …) are the most important components for speech synthesis
[Figure: the source-filter model]
Speech Production : Speech Production Voiced sounds: Fundamental frequency (i.e. harmonic structure) and energy concentrated in the lower frequencies
Un-voiced sounds: No fundamental frequency and energy focused in higher frequencies
Our approach: Detect speech by reliably detecting voiced regions
We do not extract or store any formant information. At least three formants are required to produce intelligible speech*
* 1. Donovan, R. (1996). Trainable Speech Synthesis. PhD thesis, Cambridge University.
2. O’Shaughnessy, D. (1987). Speech Communication: Human and Machine. Addison-Wesley.
Goal: Reliably Detect Voiced Chunks in Audio Stream : Goal: Reliably Detect Voiced Chunks in Audio Stream
Speech Features Computed : Speech Features Computed Spectral entropy
Relative spectral entropy
Total energy
Energy below 2 kHz (low frequencies)
Autocorrelation peak values and number of peaks
High-order mel-frequency cepstral coefficients
Features used: Autocorrelation : Features used: Autocorrelation Autocorrelation of (a) un-voiced frame and (b) voiced frame.
Voiced chunks have a higher non-initial autocorrelation peak and fewer peaks
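A minimal sketch of the two autocorrelation features named on this slide, assuming a mean-removed, normalized autocorrelation and a simple local-maximum test for peaks. The function names and the synthetic demo frames are hypothetical; the slides do not specify frame length or the peak-picking rule.

```python
import math
import random

def normalized_autocorrelation(frame):
    """Autocorrelation r[k], k = 0..n-1, scaled so r[0] == 1.

    Returns all zeros for a constant (e.g. silent) frame.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [v - mean for v in frame]
    r0 = sum(v * v for v in x)
    if r0 == 0:
        return [0.0] * n
    return [sum(x[i] * x[i + k] for i in range(n - k)) / r0 for k in range(n)]

def autocorrelation_peaks(frame):
    """Return (largest non-initial peak value, number of peaks)."""
    r = normalized_autocorrelation(frame)
    peaks = [
        r[k]
        for k in range(1, len(r) - 1)
        if r[k - 1] < r[k] >= r[k + 1]  # local maximum, lag 0 excluded
    ]
    return (max(peaks) if peaks else 0.0, len(peaks))

# Synthetic demo frames (illustration only, not project data):
# a periodic signal stands in for a voiced frame, white noise for an
# un-voiced one.
VOICED_LIKE = [math.sin(2 * math.pi * i / 20) for i in range(200)]
_rng = random.Random(0)
NOISE_LIKE = [_rng.uniform(-1.0, 1.0) for _ in range(200)]
```

On these frames the periodic signal yields one strong peak per period, while the noise frame yields many small peaks, which matches the contrast described above.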
Features used: Spectral Entropy : Features used: Spectral Entropy FFT magnitude of (a) un-voiced frame and (b) voiced frame.
Voiced chunks have lower entropy than un-voiced chunks, because voiced chunks have more structure
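A minimal sketch of spectral entropy, assuming the magnitude spectrum is normalized to sum to one and treated as a probability distribution. The slides do not say whether magnitude or power is normalized, so that choice, the function names, and the demo frames are assumptions. A naive DFT is used only to keep the example free of dependencies.

```python
import cmath
import math
import random

def magnitude_spectrum(frame):
    """Naive O(n^2) DFT magnitudes for bins 0 .. n//2."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def spectral_entropy(frame):
    """Shannon entropy (in bits) of the normalized magnitude spectrum."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    probs = [m / total for m in mags]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Synthetic demo frames (illustration only): a pure tone has a highly
# structured spectrum, white noise has a nearly flat one.
TONE = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
_rng = random.Random(1)
NOISE = [_rng.uniform(-1.0, 1.0) for _ in range(64)]
```

The tone concentrates its spectrum in a single bin and so has entropy near zero; the noise frame spreads energy across bins and has much higher entropy.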
Features used: Energy : Features used: Energy
Energy in voiced chunks is concentrated in the lower frequencies
Higher-order mel cepstral coefficients contain pitch (F0) information.
The lower-order coefficients, which describe the spectral envelope (formants), are NOT stored
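A minimal sketch of the low-frequency energy feature, computed here as the fraction of spectral energy below a cutoff. The 2 kHz cutoff comes from the feature list; the sample rate, the function name, and the use of a ratio rather than an absolute energy are assumptions.

```python
import cmath
import math

def low_band_energy_fraction(frame, sample_rate_hz, cutoff_hz=2000.0):
    """Fraction of spectral energy in bins whose frequency is below cutoff_hz.

    Uses a naive one-sided DFT (bins 0 .. n//2); bin k corresponds to
    the frequency k * sample_rate_hz / n.
    """
    n = len(frame)
    low = 0.0
    total = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate_hz / n < cutoff_hz:
            low += power
    return low / total if total > 0 else 0.0

# Synthetic tones at an assumed 8 kHz sample rate (illustration only).
SAMPLE_RATE_HZ = 8000
LOW_TONE = [math.sin(2 * math.pi * 500 * t / SAMPLE_RATE_HZ) for t in range(80)]
HIGH_TONE = [math.sin(2 * math.pi * 3000 * t / SAMPLE_RATE_HZ) for t in range(80)]
```

A 500 Hz tone puts essentially all of its energy below the cutoff and a 3 kHz tone essentially none, mirroring the voiced versus un-voiced contrast.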
Segmenting Speech Regions : Segmenting Speech Regions
Attributes Useful for Inferring Interaction : Attributes Useful for Inferring Interaction Attributes that can be reliably extracted from sensors:
Total number of interactions between people
Conversation styles – e.g. turn-taking, energy-level
Location where interactions take place – e.g. office, lobby etc.
Daily schedule of individuals – e.g. early birds, late nighters
Locations : Locations Wi-Fi signal strength can be used to determine the approximate location of each speech event
5 meter accuracy
Location computation done off-line
Raw locations are converted to nodes in a coarse topological map before further analysis
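The slides do not describe the localization method. As one common baseline, and only as an assumption, the sketch below matches a Wi-Fi scan against stored signal-strength fingerprints and returns the nearest map node. The node names, access-point names, dBm values, and the placeholder for unheard access points are all hypothetical.

```python
import math

# Hypothetical fingerprints: typical signal strength (dBm) per access
# point at each node of the coarse topological map.
FINGERPRINTS = {
    "hallway": {"ap1": -45, "ap2": -70, "ap3": -80},
    "breakout_area": {"ap1": -65, "ap2": -50, "ap3": -75},
    "meeting_room": {"ap1": -80, "ap2": -60, "ap3": -48},
}

# Assumed stand-in value for an access point missing from a scan.
MISSING_DBM = -100

def nearest_node(scan, fingerprints=FINGERPRINTS):
    """Return the map node whose fingerprint is closest to the scan.

    Distance is Euclidean over the union of access points seen in the
    scan and in the fingerprint.
    """
    def distance(reference):
        aps = set(scan) | set(reference)
        return math.sqrt(
            sum(
                (scan.get(ap, MISSING_DBM) - reference.get(ap, MISSING_DBM)) ** 2
                for ap in aps
            )
        )

    return min(fingerprints, key=lambda node: distance(fingerprints[node]))
```

For example, a scan of `{"ap1": -48, "ap2": -72}` is closest to the "hallway" fingerprint, so the speech event would be assigned to that node.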
Topological Location Map : Topological Location Map Nodes in map are identified by area types
Hallway
Breakout area
Meeting room
Faculty office
Student office
Detected conversations are associated with their area type
Social Network Model : Social Network Model Nodes
Subjects (wearing sensors, have given consent)
Public places (e.g., particular breakout area)
Regions of private locations (e.g., hallway of faculty offices)
Instances of conversations
Edges
Between subjects and conversations
Between places or regions and conversations
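A minimal sketch of the node and edge structure above, assuming each conversation instance links to its participants and to one place or region. The class name, method names, and subject codes are hypothetical; the pairwise count is just one summary that such a structure makes easy to derive.

```python
from collections import defaultdict
from itertools import combinations

class InteractionNetwork:
    """Conversation instances linked to subjects and to places/regions."""

    def __init__(self):
        self.participants = defaultdict(set)  # conversation id -> subject codes
        self.place_of = {}                    # conversation id -> place or region

    def add_conversation(self, conv_id, subject_codes, place):
        """Add subject-conversation and place-conversation edges."""
        self.participants[conv_id].update(subject_codes)
        self.place_of[conv_id] = place

    def interaction_counts(self):
        """Number of shared conversations for each pair of subjects."""
        counts = defaultdict(int)
        for subjects in self.participants.values():
            for pair in combinations(sorted(subjects), 2):
                counts[pair] += 1
        return dict(counts)

    def conversations_per_place(self):
        """Number of conversation instances attached to each place."""
        counts = defaultdict(int)
        for place in self.place_of.values():
            counts[place] += 1
        return dict(counts)

# Made-up example: coded subjects, never names.
NETWORK = InteractionNetwork()
NETWORK.add_conversation("c1", {"s01", "s02"}, "breakout_area")
NETWORK.add_conversation("c2", {"s01", "s02", "s03"}, "faculty_hallway")
```

Keeping conversations as their own nodes preserves where and with whom each interaction happened; subject-to-subject and place-level summaries can then be computed on demand.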
Non-instrumented Subjects : Non-instrumented Subjects We may recruit additional subjects who do not wear sensors
Such subjects would allow us to infer information about their behavior indirectly, and would appear (coded) as nodes in our network model
E.g., based on their particular office locations
Only people who have provided written consent appear as entities in our network models
Disabling Sensor Units : Disabling Sensor Units As a courtesy, subjects will disable their units in particular classrooms or offices
Access to the Data : Access to the Data Publications about this project will include summary statistics about the social network, e.g.:
Clustering coefficient
Motifs (temporal patterns)
We will not release the actual graph
This is prohibited by our HSD approval
We welcome additional collaborators