This is Chris Diehl's Typepad Profile.
Chris Diehl
Recent Activity
Given the way the cyber landscape is unfolding, I hope you are right! It's a bit difficult to wage network-centric warfare when the network is gone...
Illuminating the Social Infrastructure
"We shape the world by the questions we ask" - John Wheeler Recently I participated in The City Resilient, an event organized by PopTech and the Rockefeller Foundation to bring together individuals with diverse backgrounds and experience that share a common interest in advancing the resilience o...
Hi Eni,
Thanks for sharing your thoughts. What struck me following the Sandy Hook shootings is how tightly people held onto their beliefs even in the face of clearly verifiable facts that ran counter to their particular thesis. This is why I'm much more interested in questions of persuasion here. What will be judged compelling enough to sway someone's viewpoint? No matter how much effort one puts into producing objective data on the state of the world, others will not be in a position to easily verify those efforts. And if there is the slightest room for questioning, I believe those results will be questioned. We see this all the time in discussions around high-stakes issues.
When thinking about persuasion, I often wonder about the effects of social context. We know beliefs cluster in social networks. And there is a cost for holding a minority view. If you are trying to persuade individuals without taking into consideration the social context they are embedded in, are you fighting a losing battle?
One avenue where I'm slightly more bullish is citizen measurement of aspects of their community. What if we empower citizens with tools to report data about the state of their community and give them a stake in constructing that picture? If they are tied to the measurement process and curious about their community, I wonder if we have a better chance of achieving some change. This may still be a utopian dream but, like you, I'm open to exploring options! Along the way, though, it's important for us to keep in mind a realistic picture of what is possible.
Bias and Self-Interest: Timeless Constraints in the Era of Open Data
Senseless tragedies such as the recent Sandy Hook school shooting in Newtown, Connecticut give us pause as a nation and leave us struggling to understand how these acts happen all too often. With each loss, I begin to wonder if we are learning as a society. Are we truly reflecting on the problem...
Hi Mikio,
As you well know, I wholeheartedly agree with your perspective. Our responses seem so heavily conditioned by our priors that it takes significant conscious effort to overcome them and widen our perspective. Coupling this selective attention with problems that are so complex and difficult to understand leaves us in a real quandary.
Bias and Self-Interest: Timeless Constraints in the Era of Open Data
Thanks for your comment, Sean! You raise a number of important points here. Given that we know it's sometimes difficult to mitigate our own confirmation bias in our search behavior, I wonder if there is a way to change the search experience so that the machine can help avoid that outcome. What would it mean to structure the query such that, in limited circumstances, the machine could recognize the search for confirmatory evidence? I'm not sure how this would work exactly. Even if it were possible, it raises concerns in my mind about the control it transfers to the machine in the shaping of our information landscape. Maybe one would argue we've already crossed the threshold in that regard. Even if the machine were to expose the user to reputable sources (however that is defined) with alternative views, would the user even be willing to admit those sources? Trust is a curious and fascinating phenomenon.
Bias and Self-Interest: Timeless Constraints in the Era of Open Data
Hi Tim,
Thanks for your comment. No doubt there is a place for statistical models, and I'm not saying all is lost in general. My point is that for some of the most pressing questions in society today, where we are asking how to perturb complex systems (e.g. societies, nation states, economies) into more desirable states, we should not deceive ourselves into believing that we can predict their future state. There are systems that defy meaningful prediction even under ideal observability conditions. Under more realistic observability conditions, matters become even worse.
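As a rough illustration of a system that resists prediction even when it is almost perfectly observed, here is a minimal sketch using the logistic map, a standard textbook example of chaotic dynamics. The map, its parameter `r = 4.0`, the starting value, and the size of the measurement error are all assumptions chosen for illustration; none of them come from the discussion above.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# "True" system state versus a measurement that is off by one part in 10^10.
true_state = logistic_trajectory(0.2, 100)
measured = logistic_trajectory(0.2 + 1e-10, 100)

# Gap between the two trajectories at every step.
gaps = [abs(a - b) for a, b in zip(true_state, measured)]

early_gap = max(gaps[:6])   # largest gap over steps 0..5
late_gap = max(gaps[40:])   # largest gap over steps 40..100

print(f"largest gap over steps 0-5:    {early_gap:.3e}")
print(f"largest gap over steps 40-100: {late_gap:.3e}")
```

The early gap stays tiny, while the late gap is on the order of the state itself, so a forecast built from the slightly mismeasured starting point carries no information about the true state beyond a few dozen steps.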
Taleb talks about this issue in detail in his Edge essay entitled "The Fourth Quadrant." His point is that we need to be cognizant of the regimes in which our methods will provide meaningful results. In certain domains, the application of statistical methods will only lead us to false confidence.
Without a doubt, questions remain in my own mind about operationalizing some of these ideas. I hope to work them out through more writing and discussion here on the blog. In machine learning, we often conveniently gloss over the foundational assumptions upon which our algorithms rest, the stationarity assumption being the main one. Very often, we are applying our methods to problems where the consequences of errors are not significant. With lives or money on the line, one would hope we would focus more on issues of risk, but that doesn't seem to happen often enough.
I agree with your comment about using models to help us understand where we might get into trouble. Take agent-based modeling as a prime example. If ABM can aid planners in expanding their list of alternative futures, then I see that as a win. When they begin to place faith in their model and derive probabilities for those alternative futures, I begin to worry... a lot. The line is thin and many can't resist the temptation of crossing it.
The case of Long Term Capital Management is a recent example of extreme faith in the model leading to significant consequences.
Optimizing Organizational Performance in an Uncertain World - Part 1 - The Limits of Prediction
How can technology enhance the performance of organizations? How does one characterize organizational performance in a world where our ability to predict is highly constrained? These are two general questions that have been on my mind for some time now. In an effort to clarify my thoughts on the...
Hi Alain,
Thanks for your comment. I only have a very cursory knowledge of the ILP domain and don't track the work ongoing there. I certainly welcome your thoughts / suggestions on how this class of algorithms can provide value in the EDA context.
Chris
Exploring Complex, Dynamic Graph Data - Part 1
When I was in graduate school in the late 90s working on computer vision problems, MATLAB was my environment of choice. It provided all the tools I required for analysis and then some. Life was simple and grand. Five years after leaving graduate school, I decided to take a dramatic turn in my r...
Hi John,
First off, thank you for your feedback.
I think your suggestion about using probability distributions is a good one to take the description to a finer level of resolution. My initial goal is more modest in that the taxonomy is primarily meant to facilitate discussion as opposed to computational modeling. There are some details in this framework that are not as clear as I'd like them to be. The N-way label for example was meant to be a superset that includes 2-way as well. As you point out, I believe it is fair to say that the majority of such calls are still 2-way. So the N-way label can be misleading.
Unfortunately I do not have the data to address the question you posed. I wholeheartedly agree that it would be very interesting to see what communication channels people utilize over the course of their day and how those patterns of usage vary.
Categorizing Communication Mediums
When thinking about social dynamics in online social systems, it is important to consider the attributes of the communication medium that bound the set of permissible interactions. This week I've been revisiting a set of dimensions along which to characterize communication mediums in general. He...
Hi Andrew,
Thanks so much! Good to hear from you! I will have to check out the article you referenced. I find this area fascinating. So much to explore!
Glad you found the references helpful. Those two papers were a window into a whole new body of work for me.
Thanks for your comment. I look forward to sharing more thoughts and hearing your feedback!
Cheers, Chris
Social Signaling and Language Use
Consider the following email from Tim Belden to John Lavorato and Louise Kitchen. Recall that Tim Belden was indicted for his role in conceiving and executing various financial schemes that allowed Enron to profit significantly from energy markets. Through the analysis of the Enron email collect...
Hi Max,
It is more than simply a function of how many stop words are used, as some have significant positive weights and others significant negative weights. What I'm saying is that stop words form a significant fraction of the most discriminative terms as judged by the weight distribution in the learned ranker.
To answer your second question, let me provide a few additional details. The ground truth for the experiments was derived from an Enron document detailing manager-subordinate relationships that existed over about a two-year span. The ranker was trained to rank-order relationships within a given individual's ego network, which in this domain is the set of all communication relationships for that individual. Our dataset consisted of 43 ego networks of various sizes. We performed a leave-one-egonet-out cross-validation experiment to test the generalization performance of the system. We found that, on average, the relevant relationship is ranked second.
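The evaluation protocol described above can be sketched as follows. This is only an illustration under stated assumptions: the ego networks are synthetic, each has exactly one relevant relationship, the features are random toy values, and the learner is a simple pairwise perceptron rather than the ranker from the paper. All function names are hypothetical.

```python
import random

def make_egonet(rng, n_rel, n_feat=5):
    """Synthetic ego network: n_rel feature vectors, exactly one marked relevant."""
    relevant = rng.randrange(n_rel)
    rels = []
    for i in range(n_rel):
        x = [rng.random() for _ in range(n_feat)]
        if i == relevant:
            x[0] += 1.0  # toy signal (assumption): feature 0 is elevated
        rels.append(x)
    return rels, relevant

def score(w, x):
    """Linear score of one relationship's feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_ranker(egonets, n_feat=5, epochs=20, lr=0.1):
    """Pairwise perceptron: push the relevant relationship above the others."""
    w = [0.0] * n_feat
    for _ in range(epochs):
        for rels, relevant in egonets:
            pos = rels[relevant]
            for i, neg in enumerate(rels):
                if i != relevant and score(w, pos) <= score(w, neg):
                    w = [wi + lr * (p - n) for wi, p, n in zip(w, pos, neg)]
    return w

def rank_of_relevant(w, rels, relevant):
    """1-based rank of the relevant relationship under descending score."""
    target = score(w, rels[relevant])
    better = sum(1 for i, x in enumerate(rels)
                 if i != relevant and score(w, x) > target)
    return 1 + better

def leave_one_egonet_out(egonets):
    """Train on all ego networks but one, then rank within the held-out one."""
    ranks = []
    for held_out in range(len(egonets)):
        train = egonets[:held_out] + egonets[held_out + 1:]
        w = train_ranker(train)
        rels, relevant = egonets[held_out]
        ranks.append(rank_of_relevant(w, rels, relevant))
    return ranks

rng = random.Random(0)
# 43 ego networks of varying size, mirroring only the count mentioned above.
egonets = [make_egonet(rng, rng.randint(3, 12)) for _ in range(43)]
ranks = leave_one_egonet_out(egonets)
mean_rank = sum(ranks) / len(ranks)
print(f"mean rank of the relevant relationship: {mean_rank:.2f}")
```

Holding out a whole ego network, rather than individual relationships, keeps every relationship of the test individual out of training, so the mean rank reflects how well the ranker generalizes to people it has never seen.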
In terms of message ranking performance, our results are qualitative at this point. We have used the tool extensively to examine the network surrounding Tim Belden, one of the key actors in the Enron scandal. Rarely have we read more than three messages per relationship to make a judgement. We've seen compelling traffic time and again nominated by the system.
Would we like to test these ideas on other datasets? Absolutely. Email unfortunately is tough to come by. And ground truth is usually unavailable. One of the things that makes our investigation unique is the quantitative assessment.
More data would always help. We were skeptical in the beginning as well. Discovering the numerous other language studies done by folks in social psychology gave us more confidence that this phenomenon is indeed real.
For more details, see our original AAAI 07 paper. http://www.cpdiehl.org/PDF/aaai07-final.pdf
Social Signaling and Language Use
Hi Paul,
Whether or not modern IR systems can handle stop words is not the point. The simple point I wanted to illustrate here is that function words are important features when we are attempting to address social relationship identification tasks. Function word patterns carry social signals. This may seem obvious to some; yet it still remains surprising to many more. I know I've been surprised by the many compelling messages we've found in the Enron corpus through our approach to relationship and message ranking.
Social Signaling and Language Use
Hi Ryan,
I'm fairly certain that is not the case. Let me expand a bit on the ranker. The linear ranker is trained to rank communication relationships. The feature vector contains the word unigram counts for the set of messages in the relationship A -> B. That set includes messages in one direction only. If pure volume of communication provided good ranking performance, the learning algorithm could return a weight vector with all entries equal to some constant. Instead what we see is a clear preference for particular terms. In follow-on work, we replaced the L2 regularization term in the learning algorithm with an L1 term to induce sparsity and developed a method for automatically learning the level of regularization. The optimal solution for that loss yielded severe pruning of the term space.
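Two points in the explanation above can be made concrete with a small sketch. First, a weight vector with all entries equal to a constant scores a relationship by nothing more than its total word count. Second, an L1 penalty tends to zero out small weights while an L2 penalty only shrinks them. The messages, the toy weights, and the single proximal (shrinkage) step used here are assumptions for illustration; this is not the learning algorithm from the paper.

```python
from collections import Counter

def unigram_counts(messages):
    """Bag-of-words counts over all messages sent in one direction, A -> B."""
    counts = Counter()
    for msg in messages:
        counts.update(msg.lower().split())
    return counts

def linear_score(weights, counts):
    """Score of a relationship: weighted sum of its unigram counts."""
    return sum(weights.get(term, 0.0) * n for term, n in counts.items())

def soft_threshold(w, lam):
    """Proximal step for an L1 penalty: weights within lam of zero become zero."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

def l2_shrink(w, lam):
    """Proximal step for an L2 penalty: weights shrink but stay nonzero."""
    return w / (1.0 + lam)

# Hypothetical messages from A to B.
messages = ["please send me the report", "i need the numbers by friday"]
counts = unigram_counts(messages)

# A constant weight vector reduces the score to c * (total number of words).
c = 0.5
constant_weights = {term: c for term in counts}
volume_score = linear_score(constant_weights, counts)
total_words = sum(counts.values())
print(volume_score, c * total_words)

# Toy weights (made up): compare one L1 step with one L2 step.
toy_weights = {"please": 0.9, "the": -0.05, "need": 0.6,
               "report": 0.02, "friday": -0.01}
l1 = {t: soft_threshold(w, 0.1) for t, w in toy_weights.items()}
l2 = {t: l2_shrink(w, 0.1) for t, w in toy_weights.items()}
kept_l1 = [t for t, w in l1.items() if w != 0.0]
kept_l2 = [t for t, w in l2.items() if w != 0.0]
print("terms kept under L1:", kept_l1)
print("terms kept under L2:", kept_l2)
```

A learned weight vector that departs strongly from the constant case is evidence that the ranker relies on which terms appear, not just on how much traffic flows over the relationship.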
While I believe average email length could be a useful feature, I believe other features would need to be used in concert. The sensemaking aspect of this work is a critical constraint that drives the need for using the content. We have shown in previous work that you can solve the relationship ranking problem with features based on email traffic statistics. But that does not help at all with cueing the user to compelling evidence.
Social Signaling and Language Use
Hi Jessy, I didn't hear about that. Appreciate the pointer!
Neighborhood Networks
Not long after I moved into my new apartment, I experienced three separate power outages of significant duration within the span of 24 hours. Since my iPhone was charged and AT&T's network remained operational, I was able to connect with the outside world and inquire about the situation. Given...
Hi Tessa,
I have a friend as well who used to be a ham radio operator and mentioned their role in facilitating communication during disasters. It's not clear to me how that role is defined. Something I should investigate further.
In order to facilitate self-organization in situations where first responders are overwhelmed or unavailable, I still believe we need a different communication paradigm. One that supports the construction and sharing of a common picture of the environment. One that also allows neighbors to coordinate their actions as needed. People are certainly working on software that addresses some of these challenges. In fact all the pieces are probably out there to make this a reality.
Admittedly there are significant questions, in my mind at least, about how you prepare beforehand. I think we start by embracing the idea that government will not be there in all situations in a timely fashion. That ordinary citizens can and will play a crucial role in safety and security in certain scenarios. Airline safety is one area where we do not openly admit that the passengers are still the first line of defense in a 9/11-type hijacking scenario. Government officials will only admit this under duress. That mindset would need to change.
Neighborhood Networks