Below is a presentation created by Theresa Sterling in March, 2010 about online privacy, and what she refers to as your “digital dossier” that begins to be created at birth. The presentation is reasonably factual, without being overly alarming, and may help connect the dots for some people who aren’t steeped in the issues.
Finally! It’s fantastic to see that I can now talk about what I do for a living with my friends and family. There’s nothing like a good popular culture comedy icon talking about your profession to catalyze the conversation. Now perhaps I won’t get blank stares when I tell them I work for ISOC along with my pals at the IETF, Kantara Initiative, and W3C on issues relating to online Identity and privacy…
One of the presentations at the Internet2 Advance CAMP included these photos referencing some major vendors (who recently merged). Very amusing.
 
… and it’s a great way to avoid running afoul of logo usage guidelines.
I’m often talking to people about how open Internet technologies enable emergent innovations… and Eric Fischer provided an excellent example of what you can do when you have free access to seemingly unrelated data sets.
To create this image of San Francisco (he’s currently posted 50 maps), he took the geo-tagged data from photos uploaded to Flickr and Picasa, then banged the locations against OpenStreetMap using Perl and Ghostscript to overlay travel vectors of the photographers. Specifically, he compared photos taken by the same photographer within 10 minutes and bounded by 3 miles to compute and plot their travel vector. The resulting map is color-coded to indicate black=walking (7mph), red=bicycling (19mph), blue=street vehicles (43mph), green=freeway vehicles or rapid transit (>43mph).
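The pairing-and-classification step described above can be sketched roughly as follows. This is a minimal illustration in Python, not Eric's actual Perl and Ghostscript pipeline: the tuple layout, the function names, and the reading of each listed speed as an upper bound for its color are all assumptions on my part.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8
MAX_GAP = timedelta(minutes=10)   # photos must be within 10 minutes
MAX_DISTANCE_MILES = 3.0          # and within 3 miles of each other


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def classify_speed(mph):
    """Map a speed to a color, treating each listed speed as an upper bound (assumption)."""
    if mph <= 7:
        return "black"   # walking
    if mph <= 19:
        return "red"     # bicycling
    if mph <= 43:
        return "blue"    # street vehicles
    return "green"       # freeway vehicles or rapid transit


def travel_vectors(photos):
    """photos: iterable of (photographer, timestamp, lat, lon) tuples (hypothetical layout)."""
    segments = []
    ordered = sorted(photos, key=lambda p: (p[0], p[1]))
    for prev, curr in zip(ordered, ordered[1:]):
        if prev[0] != curr[0]:
            continue  # different photographers, no vector
        gap = curr[1] - prev[1]
        if gap <= timedelta(0) or gap > MAX_GAP:
            continue
        miles = haversine_miles(prev[2], prev[3], curr[2], curr[3])
        if miles > MAX_DISTANCE_MILES:
            continue
        mph = miles / (gap.total_seconds() / 3600)
        segments.append((prev[2:], curr[2:], classify_speed(mph)))
    return segments
```

Each returned segment holds a start point, an end point, and a color, which is roughly what a plotting step would need to draw one line on the map.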
I’m not going to argue for/against the privacy issues embedded within geo-tagged photos. That’s a separate issue, but this does clearly illustrate that when people have free and open access to data, they’ll combine them in clever and unique ways to generate something entirely new (and potentially useful).
Provenance: I heard about this via a tweet from @PeteWright, read a blog post (including the comments by Eric explaining his process), and ended up at Eric’s Flickr page.
Recent news about intrusions into the online accounts of public figures like U.S. vice presidential candidate Sarah Palin and prominent companies like Twitter reminds me of the not-too-distant past. These appear to be bellwether events, showing that the general public is starting to realize that protecting their identity starts with what they can (and should) control. It sometimes takes high-profile cases like these to energize action, a cycle that appears to repeat itself.
About 8 years ago I took on the challenge of securing the digital borders around the e-commerce systems for the Kraft Group’s sports properties. At that time, I could see a storm cloud gathering on the networked horizon as we built a system to unify all of the current properties and set the foundation to build out a series of interconnected portal communities. Looking forward, I knew that it was only a matter of time before a major press-worthy event would raise everyone’s awareness regarding the protection of user privacy, in the form of personally identifiable information (PII), and associated payment information.
Our business strategy was to build a core commerce engine that could handle online transactions embedded within each separate portal. Key to our success was enabling users to have a persistent identity throughout their engagement with our products. In this way we could minimize the barriers to their interacting with our content, as well as streamlining the purchase pipeline. Essentially, once users logged into any of our portals (to access premium/personalized content, manage accounts, and purchase products), we were able to effectively cater to them by simplifying their experience.
The problem with this single-sign-on model was that if a user account was compromised, the intruder could have free rein over the victim’s PII and associated payment information. I had to make the case for going the extra mile(s) by designing strict access control procedures, knowing that something bad was going to happen to a company soon and that we should be ahead of any reactionary solutions imposed upon us. I had a feeling that after some bad press, the e-commerce industry would be pressured to lock down the porous borders that were relatively common at the time.
Just such a case occurred in 2004 when hackers were able to access an estimated 8 million credit card numbers from BJ’s Wholesale Club. It took a few years for details of the incident to emerge, but it was clear even then that there were two primary issues: insecure access points, and poor audit logging. Regardless of whether it was an inside job (as was initially assumed) or an outside hack (which it turned out to be), BJ’s (among other compromised companies) had poor access control and monitoring.
This, as well as other similar incidents, prompted the creation of the Payment Card Industry Security Standards Council, founded in 2006 by American Express, Discover, JCB, MasterCard, and Visa. The payment card industry thus began requiring strict practices and controls around systems that process more than a modest threshold of transactions. It was a strong move, in advance of looming legislation, that helped steer wayward companies toward better practices. Whatever the critiques of its programs, the council has succeeded in shining a light on many problems that need to be addressed.
Fortunately, by the time the PCI guidelines hit the market, we were able to breeze through their audits. The commerce engine we’d built was tighter than what they required. It’s rare that you can so easily point to a situation like this where the extra capital cost on the front end so clearly saved money that would’ve been required to retrofit a running system.
Now, here’s where the history lesson circles around to become informative for current events. We should learn from these cases of identity intrusion and address the core issues. The obvious lesson is not to be cavalier regarding the protection of your email accounts. After all, they are your core identity asset in today’s online world. Be careful when setting up your email account and follow common sense when selecting passwords and associated “remind me” features.
Beyond what you can do for yourself today, the industry needs to step up its game, too. Fortunately, there are a number of efforts currently under way to help protect your identity. They just need to be more whole-heartedly embraced and helped to mature by the major players in the market. What’s uniquely interesting about many of the emerging solutions is that they’re user-centric, rather than being centered around any one company’s digital security practices. This focus helps solve the root problems: privacy protection starts at home, and it’s not a simple matter of more/better cyber-security and encryption.
For more information, and to become involved, I highly recommend following the open standards development relating to user-managed identity:
And, of course, the Internet Society Trust & Identity Initiative. Tell them I sent you.
I was recently in line at the first airport security checkpoint, waiting my turn for the TSA agent to allow me into the gate area. In front of me was a man who had just handed the agent his documents, and I was about to see an example of the human brain in action as a finely-tuned (and flexible) pattern matching machine and decision engine.
We’re all familiar with the airport security ceremony by now. You stand in line (fortunately the lines seem shorter these days) with your boarding pass and driver’s license (or other government-issued identification card) in hand. From what I can tell, the TSA agent confirms that the ID appears to be valid and that the embedded photo resembles the person standing there.
While the agents use loupes and fluorescent lights on the IDs, very little validation of the boarding pass seems to take place. With the ability to print your own boarding pass at home, their vetting is definitely limited. Setting aside what they could do (e.g. each pass including a hashed string encoded as a barcode the TSA agent could scan), the boarding passes seem oddly useless.
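To make the “hashed string” idea a little more concrete, here is a minimal sketch of how such a check might work. I’m assuming a keyed hash (HMAC) rather than a plain hash, since anyone can recompute a plain hash at home. The secret key shared between the issuer and the checkpoint, the field names, and the function names are all hypothetical, and the barcode encoding itself is left out.

```python
import hashlib
import hmac


def sign_pass(secret: bytes, passenger: str, flight: str, date: str) -> str:
    """Compute the token the issuer would print on the pass (as a barcode)."""
    message = "|".join((passenger, flight, date)).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_pass(secret: bytes, passenger: str, flight: str, date: str, token: str) -> bool:
    """Recompute the token at the checkpoint and compare in constant time."""
    expected = sign_pass(secret, passenger, flight, date)
    return hmac.compare_digest(expected, token)


# Hypothetical usage: the scanner reads the printed fields and the token.
secret = b"shared-secret-for-illustration-only"
token = sign_pass(secret, "DOE/JOHN", "XX123", "2009-05-01")
valid_today = verify_pass(secret, "DOE/JOHN", "XX123", "2009-05-01", token)
valid_tomorrow = verify_pass(secret, "DOE/JOHN", "XX123", "2009-05-02", token)
```

A pass altered or printed for a different date would fail the check, because the token no longer matches the printed fields.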
Or that’s what I thought until I noticed the ceremony was taking just a beat longer than usual in this case. I don’t know how much longer it was taking, but for some reason I noticed the person wasn’t moving through the checkpoint as quickly as I’d have expected. Glancing at the TSA agent, I saw that she was scrutinizing the boarding pass, then looking back at the passenger’s ID, into his face, then back to the boarding pass, her eyes darting all over it. All the while a slight frown of concentration was deepening on her face.
At this point, the passenger tried to lighten the mood by pointing to his ID and saying, “I know, the photo doesn’t look like me any more.” It’s obvious he was talking about how much he’d aged, but the TSA agent cocked her head to one side and immediately made a decision that there was something needing to be investigated before she’d let him pass.
She began asking the passenger questions about his flight, where he was going, and if he had a second ID. At this point the passenger started to sweat as he realized the situation seemed to be going pear-shaped. He sputtered something about not having another ID and started patting his pockets (as if he’d find he’d accidentally slipped his passport into his jacket before leaving for the airport). Then the magic happened.
The passenger pulled a slip of paper from his pocket and stared at it for a second, smiled, and then chuckled. He’d found his real boarding pass for this flight. Apparently, the one he’d initially handed the TSA agent was for his return flight the next day. After he handed over the correct boarding pass, the agent checked it and was visibly relieved, betraying the fact that she had been preparing herself for the worst (according, no doubt, to her training). She quickly performed the standard checks and let him pass, reaching out for my documents.
Oddly enough, during this particular trip I was reading the book “How We Decide” by Jonah Lehrer. There is a chapter in it about how a British radar operator accurately detected an incoming missile during the first Gulf War despite an apparent lack of hard evidence linking the incoming blip with a known threat.
This situation seemed similar in that the TSA agent couldn’t quite put her finger on the reason why she felt something was wrong with the passenger’s documents. She’d apparently seen enough boarding passes and IDs to have some type of ingrained sense of what patterns are right, and which are wrong. Since she had been given a valid boarding pass, with only a minor difference of a few characters, she wasn’t able to quickly home in on what specifically was wrong in this case. All she knew at that point was she had to slow things down and start probing until she was able to determine the correct course of action.
There are, of course, flaws in the airport security system, but this experience was oddly reassuring. Until a more automated system is in place, it helps that this particular TSA agent is very good at what she does. Within what turned out to be less than a minute, she had detected a slight anomaly even though she couldn’t immediately identify what it was. She then escalated the situation smoothly and easily in a way that allowed her the time to work out what was wrong.
During a talk about “Identity and Trust in Context” at IDTrust 2009, Peter Neumann mentioned the use of attribute encryption within Attribute-Based Messaging (ABM). As I was unfamiliar with ABM, I looked it up and found the following description in the paper “Using Attribute-Based Access Control to Enable Attribute-Based Messaging” by Rakesh Bobba, Omid Fatemieh, Fariba Khan, Carl A. Gunter, and Himanshu Khurana:
Attribute-Based Messaging (ABM) is the concept of allowing lists of messaging recipients to be formed dynamically by using an attribute-based recipient address. This approach brings the flexibility of attributes in enabling the sender to send targeted messages, which 1) enhances the relevance of messages to the recipient and 2) allows the sender to send confidential messages knowing that the messages would be delivered only to the intended recipients.
Basically, what this means is that a user wanting to send a message to unknown recipients would run a query against a system so it was only sent to people who match the selected attributes. For example, I could use an ABM solution to send a survey of IETF participation to colleagues who are members of at least three IETF discussion lists.
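As a rough illustration of that survey example, here is a minimal sketch in Python. It is not taken from the paper: the `Person` record, the `resolve_recipients` function, and the sample addresses and list names are all hypothetical, and a real ABM system would also enforce who is allowed to query on which attributes.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    email: str
    ietf_lists: set = field(default_factory=set)  # attribute: list memberships


def resolve_recipients(directory, predicate):
    """Build the recipient list dynamically from an attribute-based rule."""
    return [person.email for person in directory if predicate(person)]


# Hypothetical directory of colleagues and their attributes.
directory = [
    Person("alice@example.org", {"oauth", "tls", "httpbis"}),
    Person("bob@example.org", {"tls"}),
    Person("carol@example.org", {"oauth", "tls", "kitten", "httpbis"}),
]

# "Members of at least three IETF discussion lists" expressed as a rule.
survey_recipients = resolve_recipients(directory, lambda p: len(p.ietf_lists) >= 3)
```

The sender never names the recipients directly; the address is the rule, and the list is whoever matches it when the message is sent.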
I immediately thought that this is the type of solution that fits squarely in the sweet spot of the Semantic Web. I could easily see that if the attributes are encoded using RDF, an ABM system would seem to be an excellent use case leveraging SPARQL. Looking around, though, I can’t find anyone working on this approach.
Does anyone have any examples of or suggestions for this idea in practice?
Buried in a post about OpenID user experience by Chris Messina is a concise bit of advice for users:
picking an identity provider should be like picking a bank or credit card provider: as a fourth-party service provider that advocates for your interest, since you’re their customer!
The “fourth-party” reference is to an article titled “Get ready for ‘fourth party’ services” by Doc Searls in the Linux Journal.
Personally, I’m not a fan of the introduction of this term for the new party around the table. I like to think that a “third party” working on the user’s behalf fits the bill just fine. Following an object-oriented mindset, the third party can adopt the properties relating to its responsibility in a transaction without being locked between two others (necessitating a fourth).
What I do like, however, is the concept Chris clarifies later:
Instead of agreeing to terms of service that disclaim all responsibility to you, the customer, I hope that competition in the identity space will lead providers to actually take responsibility for their services — charging good money for doing so. If your account gets hacked — no problem! — your identity provider can put back the pieces and make things right again! You could even take out online identity insurance in case your identity is ever stolen — so you can always get back to your life and recover your data without the hassle and interruption when it happens today.
To unpack this a bit, I see a compelling use case for identity providers emerging, possibly piggy-backing on the PCI Security work. So far, the first quote about picking an IdP is falling on deaf ears as users don’t really think about their choice. They use what they use and that’s about as far as it goes. What users need is a compelling reason to think in terms of choice, and the model Chris suggests might be it.
I spent some time helping to build an affinity card system with MBNA a couple years ago, and that process was telling. As it relates to this discussion, I can easily see that they would jump on the opportunity to capture a market like this. All that needs to happen is for someone to write up a clear business plan around the concept. In fact, I’ll bet there’s an MBA student out there somewhere looking for their thesis.
In a nutshell, here’s what I think this looks like:
- Credit Card Company (C3) sets up a new product based on its current card-based account system.
- C3 stands up a full service identity provider (possibly built using the Higgins Identity Framework).
- For high value services, C3 executes federation agreements with key nodes.
- C3 contracts with an insurer to cover losses due to ID theft / masquerading (rates most likely locked to the NIST levels of assurance as codified by the Liberty Alliance Identity Assurance Framework).
- C3 then advertises the new product to its existing customers (ID validation fees waived as an incentive).
- Users now have a reason to choose C3 as their IdP for all high value applications (and might as well use them for everything else, too).
C3 still has to convince its customers (and attract new ones) to see value in paying for a secure IdP. I don’t believe this is too far away from happening organically, so now’s the time for a C3 to start working on the product line.
Further, it’s distinctly possible that Id end points are going to force the issue by requiring verified identity assurance and security beyond what your run-of-the-mill OP can provide. Hence services like MyID.is (which has its own issues, of course, but that’s the direction). If a C3 gets in the game, I have a feeling they’ll be able to build a more effective federation of trust, even when used in an anonymous context.
I recently had a brief Twitter exchange with @MarkHawker about the term Semantic Web. It started with his tweet:
Would love to see how all these “semantic web” applications are utilising the full SW stack with ontologies, trust and related technologies.
Quickly followed by:
All I fear is Semantic Web will go down same route as Web 2.0 definition. Needs to be clarity and understanding of underlying technology.
To which I responded with:
@markhawker With a lot going on in the SemWeb space that’s not strictly utilizing the “full stack”, I still see the movement as positive.
He followed up with:
@jtrentadams Agree movement positive as achieving full stack is one of toughest computing challenges. Though appreciation of stack needed.
And shortly thereafter with:
@jtrentadams Analogy of me having a steering wheel & engine & claiming to have a car. Devalues contributions in other areas of innovation.
Since I didn’t have a chance to respond quickly enough before the thread went stale (easy to do when I step away from the computer for more than a nanosecond), I thought I might as well follow it up here.
I’m not really one to be hung up on terms, so I don’t really mind the loose application of terms like “Web 2.0”. In my opinion, it’s just a moniker people can use as a placeholder for a grouping of technologies creating something more than what was originally rolled out in 1994. There are endless debates about what it really means, and I’m not sure anyone’s going to agree to a definition any time soon. Perhaps that’s a job best left to the historian class of 2050.
For sake of this post, assume that Web 1.0 was the “document web” where most links were essentially static. Naturally, what followed was an emerging desire to actively link resources in a way we could consider to be a more “dynamic web”. This more active type of linking opens the way for net-native applications and mashups we could call Web 2.0.
Regarding the term Semantic Web, I see it as a handler for something else again. We could just as easily call it Web 3.0, I guess, as some people do. What I see as the salient difference between the SemWeb and where we are today, however, is “context awareness”. Even in the dynamic linking we see around us today, what’s missing is connections being made due to inherent knowledge of and between the end points.
Returning to the thread with @MarkHawker, I see a major problem with the adoption of the SemWeb “technology stack” (e.g. ontologies, RDF, SPARQL, etc.). Specifically, it’s that they’re currently a tough nut to roll on top of existing systems. That being said, I see nothing wrong with easing into them where appropriate to slowly begin to build traction.
In fact, if folks are using any SemWeb tech, I’m happy to hear them crowing about it. For example, if someone’s doing nothing more than using a triple store model for their data so they can move it around with RDF, I give them a SemWeb bonus point. Each step (no matter how trivial) we collectively make toward our end points being able to effectively communicate gets us that much closer to the goal.
Consider a company going to market saying they’re “Fully Semantic Web Enabled” and all they’ve done is add RDFa into their markup. If the market responds favorably to them, more cash will emerge to support further advancement across the board.
In the end, I’m much more interested in success stories around any of “the stack”, not waiting until someone implements “the full stack”. The fully-realized SemWeb is going to grow organically, and I doubt we’ll see a clear line dividing it from its predecessors.
The deeper I dive into the various projects I’m working on, the more I encounter the substantive differences between logging into a system and having the appropriate roles to access what you need. Here is my rough-n-ready approach to the definitions, as a way to discuss the concepts:
- Authentication: Identifying yourself using some type of credentialing system.
- Authorization: Having the rights to access and/or modify something.
In our daily lives, we’re familiar with the need to authenticate. Whipping out your ATM card and entering your PIN is a common example of telling the banking system who you are (or at least the identification of your account). In the expanding universe of online services, everyone’s inundated with the standard challenge/response prompting the entry of username and password.
What’s not as often visible is what happens post-authentication. Under the covers, systems generally track what you’re allowed to do once you’ve said who you are. This is where authorization enters the story. After entering your PIN, you can pull some cash from your bank account, but your access is often limited. For example, you are only authorized to withdraw up to $N.
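The ATM example splits neatly into two separate checks, which a short sketch can make explicit. This is only an illustration under my own assumptions: the `Account` fields, the function names, and the use of a bare SHA-256 digest for the PIN are hypothetical simplifications (a real banking system would do something far more robust).

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass
class Account:
    card_number: str
    pin_digest: str        # digest of the PIN, never the PIN itself
    balance: int
    withdrawal_limit: int  # the "$N" you are authorized to withdraw


def digest_pin(pin: str) -> str:
    # Plain SHA-256 keeps the sketch short; it is not a recommendation.
    return hashlib.sha256(pin.encode("utf-8")).hexdigest()


def authenticate(accounts, card_number, pin):
    """Authentication: establish which account is presenting itself."""
    account = accounts.get(card_number)
    if account is None:
        return None
    if hmac.compare_digest(account.pin_digest, digest_pin(pin)):
        return account
    return None


def authorize_withdrawal(account, amount):
    """Authorization: decide what the authenticated account may do."""
    return 0 < amount <= min(account.withdrawal_limit, account.balance)


# Hypothetical usage.
accounts = {"4000-0001": Account("4000-0001", digest_pin("1234"), balance=1000, withdrawal_limit=300)}
me = authenticate(accounts, "4000-0001", "1234")
```

Getting the PIN right only answers the first question. Whether a given withdrawal goes through is a second, independent decision.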
This is all very well and good, but oddly enough it glosses over a fundamental issue at the core of these transactions. Specifically, when the authentication credentials are assigned, it’s not always clear the issuer truly knows the identity of the credentialed party.
This is where “assurance” comes into play. I had a great dinner conversation with Peter Alterman of the GSA who was able to shine a light on the subject. Before entering a system requiring authentication (and associated authorization), you’ll need to assert your identity to a level the managers of the system find acceptable. There is, of course, a complete gamut of acceptable levels as not all systems require the same assurance of their users’ identities.
These days when we’re creating new social networking accounts every five minutes, it’s pretty clear little to no assurance is given to the site that the name you entered when registering is valid. It’s arguable whether or not that’s a problem, but it’s not very controversial that signing up for Facebook has different needs than being credentialed to access your medical records.
Then again, that’s only true insofar as you’re going to stay within that system (or largely similar ones). What happens when you start cross-linking systems in a frenzy of mash-ups and other data portability goodness? Unfortunately, the discussion with Peter really opened my eyes to the host of issues that arise when a user creates a self-credentialing account, then wants to link data from it to a system he perceives as being more secure. It was in the context of discussing a healthcare initiative I’m working on where this problem really became obvious.
Taking my personal experience here, I’ve got what I believe to be the mac-daddy of OpenID accounts. I have been using myOpenID.com (and I firmly recommend it to anyone who will listen), and was thrilled when they rolled out their two (and three) factor authentication methods. Now if I want to log into an OpenID Relying Party, I need to present not just a username / password combination, but also provide an SSL certificate and answer my cell phone, too. So if you want to access my accounts, you need all three bits: un/pw, SSL cert, and my cell phone.
Now, here’s the ugly secret. I set up that account myself, and no one verified that I am who I said I was when I created my OpenID. Basically, you could call it “Assurance Level 0” (i.e. no assurance of my identity whatsoever).
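One way to picture the gap is to track the two properties separately: how strongly a credential is checked at login, and how well the identity behind it was verified when it was issued. The sketch below does just that. The numeric levels, service names, and field names are hypothetical and are not taken from any published assurance framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    subject: str
    auth_factors: int     # how many factors are checked at login
    assurance_level: int  # how well the identity was verified at issuance


# Hypothetical minimum assurance each kind of service might demand.
REQUIRED_ASSURANCE = {"social_network": 0, "medical_records": 3}


def may_access(credential: Credential, service: str) -> bool:
    """Strong login factors do not help if the identity was never verified."""
    return credential.assurance_level >= REQUIRED_ASSURANCE[service]


# A self-issued OpenID with three login factors, but no identity proofing.
my_openid = Credential("me", auth_factors=3, assurance_level=0)
social_ok = may_access(my_openid, "social_network")
medical_ok = may_access(my_openid, "medical_records")
```

However many factors guard the login, the self-issued credential still fails the check for the higher-assurance service.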
Who cares, you might ask rhetorically. Well, you should. While it’s fine-and-dandy for run-of-the-mill social networking systems, what happens when they start asking for your credit card number? We’re right back where we’ve been for decades. Yes, you logged into the system before entering your payment information, but when you roll back to how the account was created it’s not materially different than simply typing your credit card into a non-credentialed system.
So, basically, here’s where I am now. I’m working on a couple of projects that need to maintain a level of security throughout the process lifecycle. Apparently, I need to be much more careful how credentials are allocated, and it’s not enough to rely on self-credentialing. I’m looking into automated multi-factor identification assurance models now (akin to SSL certification methods used by VeriSign that use third-party triangulation).
If you’re curious about this assurance discussion, you might want to pop over to the Identity Assurance Framework released by the Liberty Alliance. You’ll also notice that Peter was a co-chair for some of the work that went into it. I guess I was talking to the right guy.
END NOTE: I wish I knew the origin of the saying, “to name something is to control it”. In my experience that’s nearly accurate in that once a (good) name is applied to something, it’s easier to discuss, and thus feel a sense that what’s been named is more effectively understood. That’s not necessarily true, of course, but at least agreeing on a common name is a good start. Now that I’ve got a handle on the term “assurance”, I have a sense I’ll be able to more effectively grapple with the concept.
Online Privacy and Your Digital Dossier
Privacy on the Web? on Prezi