This is Scott Waterhouse's Typepad Profile.
Scott Waterhouse
Interests: music, travel, motorcycles, cycling, recovery, movies, backup, and gadgets.
Recent Activity
Mr. Appliance; Data integrity on the Data Domain Archiver is provided by the same exceptionally sound architecture as a regular Data Domain appliance: the Data Invulnerability Architecture. All data is protected in multiple ways--including RAID, data-at-rest consistency checking, and multiple levels of hashing--and the system is self-healing and does not propagate errors. The data integrity features of the Data Domain architecture are amongst the strongest and most secure of any storage system of any kind.
Commented Aug 31, 2011 on Data Domain Archiving at The Backup Blog
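Since the comment above leans on consistency checking and multiple levels of hashing, here is a minimal Python sketch of the general shape of verify-after-write and verify-on-read integrity checks. It is purely illustrative (an in-memory store of my own invention), not Data Domain's actual implementation.

```python
import hashlib

# Purely illustrative: the general shape of verify-after-write and
# verify-on-read integrity checking. Not Data Domain's actual code.

def store_segment(store: dict, data: bytes) -> str:
    """Store a segment keyed by its SHA-256 fingerprint, then re-verify."""
    digest = hashlib.sha256(data).hexdigest()
    store[digest] = data
    # Read the segment back and re-hash it to confirm the write landed intact.
    if hashlib.sha256(store[digest]).hexdigest() != digest:
        raise IOError(f"verify-after-write failed for segment {digest}")
    return digest

def read_segment(store: dict, digest: str) -> bytes:
    """Fetch a segment and re-verify its fingerprint before returning it."""
    data = store[digest]
    if hashlib.sha256(data).hexdigest() != digest:
        raise IOError(f"on-read consistency check failed for segment {digest}")
    return data

# Usage:
# store = {}
# key = store_segment(store, b"some backup data")
# assert read_segment(store, key) == b"some backup data"
```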
Tina; The deduplication ratios should be roughly the same between the two methods. Certainly there is nothing so significant that you would ever choose one method over the other for this reason.
Scott Waterhouse is now following The Typepad Team
Mar 15, 2010
Daniel; The situation you speak of may well be the case--and this is an ideal use case for target deduplication. Avamar may still be appropriate, but there are a host of issues to consider. As an interesting aside, most database backups with deduplication default to a fixed-block deduplication of 8 KB, because that matches the size of a database block in most cases anyway, so it turns out to be more efficient to do this. On the other hand, we still achieve net deduplication ratios similar to the variable-length dedup that I discussed above (in part due to how well databases compress, and assuming that we are talking about a database with an average change rate).
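For readers who want to see what fixed-block deduplication looks like mechanically, here is a minimal Python sketch of 8 KB fixed-block chunking against a fingerprint index. It illustrates the concept only; it is not Avamar's or Data Domain's actual algorithm.

```python
import hashlib

BLOCK_SIZE = 8 * 1024  # the 8 KB fixed block size discussed above

def backup_fixed_block(data: bytes, index: dict) -> list[str]:
    """Split a stream into fixed 8 KB blocks and store each unique block once.

    `index` maps fingerprint -> block and persists across backups, so a block
    already seen in an earlier backup is never stored again. The returned list
    of fingerprints is the "recipe" needed to reassemble the stream.
    """
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        index.setdefault(digest, block)  # store only if previously unseen
        recipe.append(digest)
    return recipe
```

Because database blocks are themselves fixed-size and aligned, the chunk boundaries line up with the data's natural structure, which is why fixed-block chunking can match variable-length dedup for this workload.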
Peter; Well, I am not sure how you could have a missing file but not even know where it is from. Having said that, I can search for it in Avamar (assuming I know the name... if you don't know that, and you don't know where it is from, how do you know it is missing?). With Avamar it would take about 2 seconds to recover--there is no penalty for an "incremental" because there is no such thing as an incremental with Avamar. It is possible, and no, conceptually, there is no difference (just the seek/load time that makes tape so awful to deal with in the first place).
Paul; This is on my (medium-sized) to-do list for the blog. I will do my best to cover this in the next few weeks. Scott
Commented Jan 19, 2010 on Avamar v5.0 at The Backup Blog
Steve; I agree with everything you said. However, my (very) unscientific survey says: backup admins like guest level, VMware admins like image level. And as a backup guy I often have to articulate to VMware admins why image level has some issues.
Madhav; It is an excellent question. A couple other smart people have asked me this as well, and I haven't posted up anything yet because I don't have a definitive answer yet. There seems to be a discrepancy between what I get from some sources vs. others. I am actively pursuing this and will post something definitive when I think I have that definitive answer. Stay tuned!
You can find me at scott dot waterhouse at gmail dot com or at my LinkedIn profile: http://ca.linkedin.com/in/sjwaterhouse Send me a note at gmail if you want my EMC address (or you can figure it out easily if you know our standard format of first name underscore last name at emc dot com).

With that kind of change rate your options are limited, unfortunately. It would be interesting to explore hosting a DD box at your DR provider's site to see if that would be any less expensive.
Commented Dec 24, 2009 on Avamar 101 at The Backup Blog
I would definitely advise going through a new sizing exercise. Bear in mind that the new Avamar nodes are of a different physical size, and you want to work with your sizer to ensure they are sizing based on the older 2 TB nodes.

As far as databases go, it depends on what you consider large! ;) Generically, anything under 1 TB or so is fine (with a possible exception of Domino servers, which seem to generate exceptionally high change rates). Anything between 1-2 TB should be carefully considered: What is the change rate? What is the tolerance of the host to a backup process? Can you run a proxy server? Is it a VM or a physical system? Anything above 2 TB may be OK, but would almost certainly require a proxy.

Another huge generalization: you are probably only going to do this if this is the last thing you have that you want to put on Avamar--i.e., doing this means you can turn off your traditional backup. You said you had NetWorker, so your other strategy might be to run NW + Data Domain systems for databases and high-change-rate, large datasets, and Avamar for the remainder (remote, smaller, VMware, etc.). If you have other questions just post them up, and if they seem common to me I will address them in a separate post.
Commented Dec 23, 2009 on Avamar 101 at The Backup Blog
Frank; Sizing can be a bit of an art. EMC has a tool that can accurately size an environment, and I would advise you to try to get your Avamar provider or EMC to use it for your environment. To size accurately, you need to account for commonality across platforms, change rate, retention times, amount of source data, and so on. You can get an estimate by using a dedup calculator (like the one I link to), but that doesn't take into account commonality across Avamar clients, and doesn't size for a grid... The sizing tool really is the best way. Assuming it is not grossly more than you require, I usually recommend starting with a DS5 or DS6 (5 or 6 node grid), as upgrades from those configurations follow an easier, less disruptive path than upgrades from single/dual node configurations.
Commented Dec 23, 2009 on Avamar 101 at The Backup Blog
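As a very rough illustration of the arithmetic behind the sizing discussion above (source size, change rate, deduplication, retention), here is a back-of-envelope sketch. Every default below is an assumption I picked for illustration; it is no substitute for the real EMC sizing tool, which also models commonality across clients.

```python
def rough_grid_capacity_tb(source_tb: float,
                           initial_dedup: float = 3.0,        # assumed first-backup ratio
                           daily_change_rate: float = 0.003,  # assumed 0.3% change per day
                           incremental_dedup: float = 10.0,   # assumed ratio on changed data
                           retention_days: int = 30) -> float:
    """Back-of-envelope grid capacity estimate. Illustrative only; a real
    sizing must also account for commonality across clients, compression, etc."""
    first_backup = source_tb / initial_dedup
    daily_new = (source_tb * daily_change_rate) / incremental_dedup
    return first_backup + daily_new * retention_days

# e.g. 20 TB of source data held for 30 days:
# print(round(rough_grid_capacity_tb(20.0), 2))  # ~6.85 TB under these assumptions
```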
Jesper; I haven't seen any good numbers for a few years now. For what it's worth, the consensus seems to be: Symantec at 45-50% (including NBU and BE); EMC at 15-20% with NW and Avamar; IBM at 15% with TSM; others at 15-25%. Yes, those are pretty big ranges, but trying to be more precise than that doesn't seem valid to me without data to back it up. Anecdotally, I think CA, CommVault, and HP DP probably have about 3-5% each. That leaves about 5% of the market for the current niche players, like Veeam.
Commented Nov 30, 2009 on Avamar v5.0 at The Backup Blog
Agreed totally. In fact, as I considered the issues when writing this--and the absence of independent/objective tools, standards, and metrics for evaluating the risk and cost of data loss and recovery--I wondered whether the insurance industry had something that could be ported or translated to be useful for us in backup. And whether it would be intelligible to a person of average mathematical ability (here in Canada actuaries are very highly trained in mathematics--well beyond my level!).
Commented Nov 24, 2009 on Do You Need Backup? at The Backup Blog
Paul; Please don't take offense! Veeam may be a truly fantastic product. But tracking everybody in this space is well nigh impossible, and Veeam appears to have ~ 1% market share. Just hadn't hit my personal radar yet. But honestly, I wish you folks all the success. There is no malice in my comments. It seems like you have a great team, and happy customers, and that is great to see. And by the way EMC NetWorker Fast Start has been a huge success for us, and is focused on the 1-20 server (backup client) segment. EMC has solutions for everybody from home/small office remote backup (Mozy), small business (NetWorker Fast Start), cloud backup, as well as the number one source and target deduplication solutions.
Commented Nov 21, 2009 on Avamar v5.0 at The Backup Blog
Veeam; Welcome to the conversation. In the spirit of being welcoming, I have posted your comment, although in the future it would be nice if comments felt less like advertising/spam. :) Having said that, my claim was "first major backup product." With appropriate respect, I am not sure Veeam falls into the major category! This is the first I have heard of your product. (Not that I am saying it is bad... and thanks for the comment, because we can now follow your link and form some opinions.) But by market share, Veeam seems to be a non-entity? Sorry if that seems unfair, but by market share there are a limited number of contenders: Symantec (NBU and BE), EMC (NetWorker/Avamar), IBM (TSM), and, running fairly distant behind these, HP (DP), CommVault, and CA (ArcServe).
Commented Nov 20, 2009 on Avamar v5.0 at The Backup Blog
Full points for Fx. Dead Vlei on the eastern edge of the Namib, near Sossusvlei. Spectacular place to visit--not easy to get to--but unforgettable.
Commented Nov 2, 2009 on The Backup Blog Returns at The Backup Blog
Curtis... "It's the system that makes it a backup..." Isn't that pretty much what I said? :) A copy is a necessary but not sufficient component of a backup system.
Commented Sep 29, 2009 on Copies and Backups Revisited at The Backup Blog
Tim; The auto media management point is tangential. Sorry. At least it ensures that new virtual media are added to the pool automatically. Ideally one script would do it all...!
Commented Sep 10, 2009 on Relabeling Tapes in NetWorker at The Backup Blog
Chuck; I like the idea of a bar code on the vApp a lot. Now extend the idea to data structures: you could have a data protection bar code (that describes the data protection policy to be applied to the container); I spoke about that here: http://thebackupblog.typepad.com/thebackupblog/2009/06/a-data-protection-taxonomy.html But why not also have a bar code for data replication and availability? Why not make these bar codes objects with inheritable characteristics? Why not make some of the service providers that can act upon the policies contained in the bar code a part of vSphere? There is a lot of mileage in this approach, in my opinion.
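To make the "objects with inheritable characteristics" idea concrete, here is a small sketch. All names below are hypothetical, mine rather than any real vSphere or EMC API: a container without its own data protection bar code inherits its parent's.

```python
from dataclasses import dataclass
from typing import Optional

# All names below are hypothetical; this is not a real vSphere or EMC API.

@dataclass
class ProtectionBarCode:
    backup_schedule: str        # e.g. "daily"
    retention_days: int
    replicate_offsite: bool

class Container:
    """A vApp/folder-like object whose data protection bar code is inheritable."""

    def __init__(self, name: str, parent: Optional["Container"] = None,
                 bar_code: Optional[ProtectionBarCode] = None):
        self.name = name
        self.parent = parent
        self._bar_code = bar_code

    @property
    def bar_code(self) -> Optional[ProtectionBarCode]:
        # Walk up the hierarchy: a child without its own policy inherits one.
        if self._bar_code is not None:
            return self._bar_code
        return self.parent.bar_code if self.parent else None

# datacenter = Container("dc1", bar_code=ProtectionBarCode("daily", 30, True))
# payroll = Container("payroll-vapp", parent=datacenter)
# payroll.bar_code  # inherited from dc1; a service provider could act on it
```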
While that is true, what we have done is lower our marginal cost of backup; we have not achieved economies of scale. What I mean is: say I have 50 TB to back up. I might do this with tape for $300k in initial infrastructure. With deduplication, I might get by with $300k in initial infrastructure too (albeit with much higher performance and service levels than the tape solution). But assume I haven't sized the solution for growth. With tape, if I add a TB of data to my source, I now need to buy another tape drive. And another one for every 5 (?) TB after the initial 50. With disk, if I add another TB of data, same thing. For every TB I add to my source, I have to add capacity to my target in a linear fashion. That fashion might be .5x, but it is not like it becomes .2x at 100 TB. If anything, I see a jump in costs again as I need to acquire an additional robot or additional dedup head. So we are winning the battle--reducing costs--but losing the war, because those costs continue to maintain a linear relationship with the cost/capacity of the source data.
Commented Aug 13, 2009 on Backup Sucks: Reason #38 at The Backup Blog
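The linear relationship described above is easy to see in a toy cost model. The $300k and 50 TB figures come from the comment; every other dollar figure below is a placeholder of mine, not a real quote.

```python
# Toy model of the point above: target cost stays roughly linear in source TB,
# with step jumps for extra robots/dedup heads. All prices are placeholders.

INITIAL_INFRA = 300_000     # covers the first 50 TB, per the example above
COST_PER_EXTRA_TB = 3_000   # assumed marginal cost of target capacity
HEAD_EVERY_TB = 50          # assumed: one more robot/dedup head per 50 TB added
HEAD_COST = 100_000         # assumed step cost of that extra head

def backup_cost(source_tb: float) -> float:
    extra_tb = max(0.0, source_tb - 50)
    extra_heads = int(extra_tb // HEAD_EVERY_TB)
    return INITIAL_INFRA + extra_tb * COST_PER_EXTRA_TB + extra_heads * HEAD_COST

# Cost per source TB barely improves as the environment grows:
# for tb in (50, 100, 200):
#     print(tb, round(backup_cost(tb) / tb))   # 6000, 5500, 5250 $/TB
```

Under these assumptions the cost per source TB falls only from $6,000 to $5,250 as the environment quadruples: cheaper at the margin, but nothing like a true economy of scale.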
Thanks Stephen. I have made a modification so that the poll now reads "business data". I know there is a lot of other ambiguity, and I am going to leave things that way (including the archive stuff--my intuition is that cloud archive is going to happen easier and faster than cloud backup, btw). Unfortunately I lost all the votes when I did that--anybody who has already voted, please feel free to do so again.
Commented Jul 24, 2009 on New Poll: Cloud Backup? at The Backup Blog
I would add that my anecdotal evidence shows DD folks are pretty excited to join too. The few folks that I have talked to have indicated they are very excited to be joining EMC, and very much looking forward to the next few months. I will admit that my sample size is small, so take this for what it is worth, but I have only positives to report from DD employees, no negatives so far. As far as selling less rather than more--well, EMC has been doing it successfully for years. Archiving is all about efficiency. Even at the most superficial level, Centera is cheaper than DMX/V-Max. EDLs with deduplication are less expensive than storage without. We offer dedup on Celerra. I have never had a hint of resistance internally on any of these strategies. Scott
Thanks for providing the references. I should have made it clear that you were citing third-party data to substantiate your claims, not just making stuff up as you went! I think we can both agree the numbers are a little out of date, and in the case of the value of lost data, perhaps a little suspect. And we can definitely agree that tape alone is not the most cost-effective, reliable, or secure way to protect data.
EMC continues to have an important and substantial business relationship with Quantum. We will also continue to sell and support the Quantum based DL systems as per standard EMC policy. (Meaning that you will still be able to get EMC service and support--they will not be immediately end of lifed or anything like that!)
Commented Jul 21, 2009 on Welcome Aboard Data Domain at The Backup Blog
Preston; The first two sets of numbers are from Double Take (for tape and their software). There is not much I can say in defense of them, as their white paper does a poor job of outlining the assumptions used to generate them. As I commented in the previous post to Curtis Preston, it is possible to imagine a "new" Avamar system for $5k in some circumstances. As for the tape being way out--maybe. I think the math comes to 40 tapes per site, with a robot, and a server at each site too. That is only $18,000 per year per site. That doesn't strike me as wildly unrealistic.
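For what it's worth, the per-site math above can be reconstructed with placeholder prices. Only the 40 tapes and the ~$18,000/year total come from the comment; every unit cost below is an assumption of mine for illustration, not Double Take's numbers or a vendor quote.

```python
# Rough reconstruction of the ~$18,000/year/site figure. Every price below
# is an illustrative assumption, not a vendor quote.

capex = {
    "tape robot":          20_000,
    "backup server":        8_000,
    "backup software":     10_000,
    "40 tape cartridges":  40 * 60,
}
amortization_years = 3   # assumed useful life of the hardware/software
opex_per_year = 4_500    # assumed maintenance, vaulting, and media handling

annual_per_site = sum(capex.values()) / amortization_years + opex_per_year
print(f"~${annual_per_site:,.0f}/year/site")  # ~$17,967 under these assumptions
```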