This is Chris Diehl's Typepad Profile.
Chris Diehl
Recent Activity
Given the way the cyber landscape is unfolding, I hope you are right! It's a bit difficult to wage network-centric warfare when the network is gone...
Hi Eni, Thanks for sharing your thoughts. What struck me following the Sandy Hook shootings is how tightly people held onto their beliefs even in the face of clearly verifiable facts that ran counter to their particular thesis. This is why I'm much more interested in questions of persuasion here. What will be judged to be compelling enough to sway someone's viewpoint? No matter how much effort one goes through to produce some objective data on the state of the world, others will not be in a position to easily verify those efforts. And if there is the slightest room for questioning, I believe those results will be questioned. We see this all the time in discussion around high-stakes issues. When thinking about persuasion, I often wonder about the effects of social context. We know beliefs cluster in social networks. And there is a cost for holding a minority view. If you are trying to persuade individuals without taking into consideration the social context they are embedded in, are you fighting a losing battle? One avenue where I'm slightly more bullish is citizen measurement of aspects of their community. What if we empower citizens with tools to report data about the state of their community and give them a stake in constructing that picture? If they are tied to the measurement process and curious about their community, I'm wondering if we have a better chance of achieving some change. This may still be a utopian dream but, like yourself, I'm open to exploring options! Along the way, it's important, though, for us to keep a realistic picture in mind of what is in the realm of the possible.
Hi Mikio, As you well know, I wholeheartedly agree with your perspective. Our responses seem so heavily conditioned by our priors that it takes significant conscious effort to overcome them and widen our perspective. Coupling this selective attention with problems that are so complex and difficult to understand leaves us in a real quandary.
Thanks for your comment, Sean! You raise a number of important points here. Given we know it's sometimes difficult to mitigate our own confirmation bias in our search behavior, I wonder if there is a way to change the search experience so that the machine can help avoid that outcome. What would it mean to structure the query such that, in limited circumstances, the machine could recognize the search for confirmatory evidence? Not sure how this would work exactly. Even if it were possible, it raises concerns in my mind about the control it transfers to the machine in the shaping of our information landscape. Maybe one would argue we've already crossed the threshold in that regard. Even if the machine were to expose the user to reputable sources (however that is defined) with alternatives, would the user even be willing to admit those sources? Trust is a curious and fascinating phenomenon.
Hi Tim, Thanks for your comment. No doubt there is a place for statistical models and I'm not saying all is lost in general. My point is that for some of the most pressing questions we have in society today where we are asking questions about how to perturb complex systems (e.g., societies, nation states, economies) into more desirable states, we should not deceive ourselves into believing that we can predict their future state. There are systems that even under ideal observability conditions defy meaningful prediction. When you add in more realistic observability conditions, matters become even worse. Taleb talks about this issue in detail in his Edge essay entitled "The Fourth Quadrant." His point is that we need to be cognizant of the regimes in which our methods will provide meaningful results. In certain domains, the application of statistical methods will only lead us to false confidence. Without a doubt, questions remain in my own mind about operationalizing some of these ideas. I hope to work them out through more writing and discussion here on the blog. In machine learning, we often conveniently gloss over the details of the foundational assumptions upon which our algorithms rest, the stationarity assumption being the main one. Very often, we are applying our methods to problems where the consequences of errors are not significant. With lives or $$ on the line, one would hope we would focus more on issues of risk, but that doesn't seem to come to pass often enough. I agree with your comment about using models to help us understand where we might get into trouble. Take agent-based modeling as a prime example. If ABM can aid planners in expanding their list of alternative futures, then I see that as a win. When they begin to place faith in their model and derive probabilities for those alternative futures, I begin to worry... a lot. The line is thin and many can't resist the temptation of crossing it.
The case of Long Term Capital Management is a recent example of extreme faith in the model leading to significant consequences.
Hi Alain, Thanks for your comment. I only have a very cursory knowledge of the ILP domain and don't track the work ongoing there. I certainly welcome your thoughts / suggestions on how this class of algorithms can provide value in the EDA context. Chris
Hi John, First off, thank you for your feedback. I think your suggestion about using probability distributions is a good one to take the description to a finer level of resolution. My initial goal is more modest in that the taxonomy is primarily meant to facilitate discussion as opposed to computational modeling. There are some details in this framework that are not as clear as I'd like them to be. The N-way label for example was meant to be a superset that includes 2-way as well. As you point out, I believe it is fair to say that the majority of such calls are still 2-way. So the N-way label can be misleading. Unfortunately I do not have the data to address the question you posed. I wholeheartedly agree that it would be very interesting to see what communication channels people utilize over the course of their day and how those patterns of usage vary.
Commented Sep 25, 2010 on Categorizing Communication Mediums at Chris Diehl
Hi Andrew, Thanks so much! Good to hear from you! I will have to check out the article you referenced. I find this area fascinating. So much to explore! Glad you found the references helpful. Those two papers were a window into a whole new body of work for me. Thanks for your comment. I look forward to sharing more thoughts and hearing your feedback! Cheers, Chris
Commented May 2, 2010 on Social Signaling and Language Use at Chris Diehl
Hi Max, It is more than simply a function of how many stop words are used, as some have significant positive weights and others significant negative weights. What I'm saying is that stop words form a significant fraction of the most discriminative terms as judged by the weight distribution in the learned ranker. To answer your second question, let me provide a few additional details. The ground truth for the experiments was derived from an Enron document detailing manager-subordinate relationships that existed over about a two-year span of time. The ranker was trained to rank-order relationships within a given ego network for an individual. The ego network for a given individual in this domain is the set of all communication relationships for that individual. Our dataset consisted of 43 ego networks of various sizes. We performed a leave-one-egonet-out cross-validation experiment to test the generalization performance of the system. What we have found is that on average the relevant relationship is ranked 2nd. In terms of message ranking performance, our results are qualitative at this point. We have used the tool extensively to examine the network surrounding Tim Belden, one of the key actors in the Enron scandal. Rarely have we read more than three messages per relationship to make a judgement. We've seen compelling traffic time and again nominated by the system. Would we like to test these ideas on other datasets? Absolutely. Email unfortunately is tough to come by. And ground truth is usually unavailable. One of the things that makes our investigation unique is the quantitative assessment. More data would always help. We were skeptical in the beginning as well. Discovering the numerous other language studies done by folks in social psychology gave us more confidence that this phenomenon is indeed real. For more details, see our original AAAI 07 paper.
Commented Apr 30, 2010 on Social Signaling and Language Use at Chris Diehl
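The leave-one-egonet-out protocol described in the reply to Max can be sketched roughly as follows. This is only an illustration under stated assumptions: the pairwise perceptron update stands in for the actual learning algorithm, which the comment does not specify, and the toy ego networks, their word counts, and all function names are hypothetical.

```python
# Minimal sketch of leave-one-egonet-out cross-validation for a linear ranker.
# ASSUMPTIONS: the pairwise perceptron trainer and the toy data below are
# illustrative stand-ins, not the method or data from the original study.

def score(weights, counts):
    """Linear score: weighted sum of word unigram counts."""
    return sum(weights.get(term, 0.0) * n for term, n in counts.items())

def train_pairwise_perceptron(egonets, epochs=10):
    """Learn weights so relevant relationships outscore the others
    within the same ego network."""
    weights = {}
    for _ in range(epochs):
        for relationships in egonets:
            relevant = [c for c, is_rel in relationships if is_rel]
            others = [c for c, is_rel in relationships if not is_rel]
            for pos in relevant:
                for neg in others:
                    if score(weights, pos) - score(weights, neg) <= 0:
                        # Misordered pair: move weights toward pos - neg.
                        for term, n in pos.items():
                            weights[term] = weights.get(term, 0.0) + n
                        for term, n in neg.items():
                            weights[term] = weights.get(term, 0.0) - n
    return weights

def rank_of_relevant(weights, relationships):
    """1-based rank of the first relevant relationship in the ordering."""
    ordered = sorted(relationships,
                     key=lambda r: score(weights, r[0]), reverse=True)
    for position, (_, is_rel) in enumerate(ordered, start=1):
        if is_rel:
            return position
    raise ValueError("ego network has no relevant relationship")

def leave_one_egonet_out(egonets):
    """Hold out each ego network in turn, train on the rest, and
    return the mean rank of the relevant relationship."""
    ranks = []
    for held_out in range(len(egonets)):
        training = egonets[:held_out] + egonets[held_out + 1:]
        weights = train_pairwise_perceptron(training)
        ranks.append(rank_of_relevant(weights, egonets[held_out]))
    return sum(ranks) / len(ranks)

# Hypothetical ego networks: each relationship is (word counts, is_relevant).
TOY_EGONETS = [
    [({"please": 3, "the": 2}, True),
     ({"thanks": 2, "the": 2}, False),
     ({"hi": 1, "the": 1}, False)],
    [({"thanks": 1, "the": 3}, False),
     ({"please": 2, "the": 3}, True)],
    [({"hi": 2, "the": 1}, False),
     ({"please": 1, "the": 1}, True),
     ({"thanks": 3, "the": 4}, False)],
]

mean_rank = leave_one_egonet_out(TOY_EGONETS)
print("mean rank of the relevant relationship:", mean_rank)
```

Holding out whole ego networks, rather than individual relationships, keeps every relationship of the test individual out of training, which is what makes the mean rank a measure of generalization to unseen people.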
Hi Paul, Whether or not modern IR systems can handle stop words is not the point. The simple point I wanted to illustrate here is that function words are important features when we are attempting to address social relationship identification tasks. Function word patterns carry social signals. This may seem obvious to some; yet it still remains surprising to many more. I know I've been surprised by the many compelling messages we've found in the Enron corpus through our approach to relationship and message ranking.
Commented Apr 30, 2010 on Social Signaling and Language Use at Chris Diehl
Hi Ryan, I'm fairly certain that is not the case. Let me expand a bit on the ranker. The linear ranker is trained to rank communication relationships. The feature vector contains the word unigram counts for the set of messages in the relationship A -> B. That set includes messages in one direction only. If pure volume of communication provided good ranking performance, the learning algorithm could return a weight vector with all entries equal to some constant. Instead what we see is a clear preference for particular terms. In follow-on work, we replaced the L2 regularization term in the learning algorithm with an L1 term to induce sparsity and developed a method for automatically learning the level of regularization. The optimal solution for that loss yielded severe pruning of the term space. While I believe average email length could be a useful feature, I believe other features would need to be used in concert. The sensemaking aspect of this work is a critical constraint that drives the need for using the content. We have shown in previous work that you can solve the relationship ranking problem with features based on email traffic statistics. But that does not help at all with cueing the user to compelling evidence.
Commented Apr 30, 2010 on Social Signaling and Language Use at Chris Diehl
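Two points from the reply to Ryan can be illustrated with a small sketch: a constant weight vector reduces the linear score to pure message volume, and an L1 penalty prunes terms where an L2 penalty only shrinks them. The code below is an assumption-laden illustration, not the learning algorithm from the work discussed: the weights and counts are made up, the update rules are the textbook gradient step for an L2 penalty and the soft-thresholding (proximal) step for an L1 penalty, and the automatic selection of the regularization level mentioned in the comment is not shown.

```python
# Minimal sketch, assuming made-up weights and counts.
# ASSUMPTION: l2_shrink and l1_soft_threshold are generic textbook updates,
# not the specific optimizer used in the follow-on work.

def linear_score(weights, counts):
    """Linear ranker score over word unigram counts for messages A -> B."""
    return sum(weights.get(term, 0.0) * n for term, n in counts.items())

def l2_shrink(weights, step, lam):
    """Gradient step on (lam / 2) * ||w||^2: scales every weight toward
    zero but leaves all of them nonzero."""
    return {term: w * (1.0 - step * lam) for term, w in weights.items()}

def l1_soft_threshold(weights, step, lam):
    """Proximal step for lam * ||w||_1: weights whose magnitude is at most
    step * lam are dropped (set exactly to zero)."""
    pruned = {}
    for term, w in weights.items():
        magnitude = abs(w) - step * lam
        if magnitude > 0:
            pruned[term] = magnitude if w > 0 else -magnitude
    return pruned

# 1) A constant weight vector ranks relationships by raw volume alone.
counts_a = {"please": 1, "the": 2}   # hypothetical relationship, 3 tokens
counts_b = {"thanks": 4, "the": 3}   # hypothetical relationship, 7 tokens
uniform = {"please": 1.0, "thanks": 1.0, "the": 1.0}
volume_scores = (linear_score(uniform, counts_a),
                 linear_score(uniform, counts_b))

# 2) L2 keeps every term; L1 prunes the small-magnitude ones.
learned = {"please": 0.9, "the": 0.05, "thanks": -0.6, "of": -0.02}
dense = l2_shrink(learned, step=1.0, lam=0.1)
sparse = l1_soft_threshold(learned, step=1.0, lam=0.1)

print("scores under uniform weights:", volume_scores)
print("terms kept under L2:", sorted(dense))
print("terms kept under L1:", sorted(sparse))
```

A learned weight vector that departs from the uniform one is therefore evidence that the ranker is using more than traffic volume, and the surviving terms under L1 give a short list to inspect.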
Hi Jessy, I didn't hear about that. Appreciate the pointer!
Commented Nov 17, 2009 on Neighborhood Networks at Chris Diehl
Hi Tessa, I have a friend as well who used to be a ham radio operator and mentioned their role in facilitating communication during disasters. It's not clear to me how that role is defined. Something I should investigate further. In order to facilitate self-organization in situations where first responders are overwhelmed or unavailable, I still believe we need a different communication paradigm. One that supports the construction and sharing of a common picture of the environment. One that also allows neighbors to coordinate their actions as needed. People are certainly working on software that addresses some of these challenges. In fact all the pieces are probably out there to make this a reality. Admittedly there are significant questions, in my mind at least, about how you prepare beforehand. I think we start by embracing the idea that government will not be there in all situations in a timely fashion. That ordinary citizens can and will play a crucial role in safety and security in certain scenarios. Airline safety is one where we do not openly admit that the passengers are still the first line of defense in a 9/11 type hijacking scenario. Government officials will only admit this under duress. That mindset would need to change.
Commented Nov 15, 2009 on Neighborhood Networks at Chris Diehl