
July 23, 2015

“Made for Solid State” SDS Architecture – Key Considerations

Priyadarshi Prasad - Atlantis

My previous blog post is generating some interesting questions about why I think Atlantis USX is solid-state optimized while others in the SDS/HC category are not.
Hint – it is fundamentally about the architectural approach to treating solid-state media in an optimized fashion.
 
Let’s get into the details. It is important to understand the strengths and weaknesses of NAND flash (currently the most popular solid state media):
  1. While we all know flash is finicky when it comes to writes, we often forget to factor that in when evaluating architectures.
  2. Secondly, while flash is faster than HDDs, it is still at least an order of magnitude slower than RAM.
So starting with (1): if there is a technically solid and widely agreed-upon reason to avoid writes to flash as much as possible, why are we willing to entertain architectures that do the exact opposite and harass flash with writes over and over again? Blatant examples are the architectures that implement post-process deduplication. Just imagine the torment flash goes through in a post-process dedup world:
  1. Application-initiated write I/Os arrive and get written. But dig deeper and you see what really happens...
    • Protection overhead – With protection mechanisms, a single write on the front end (an application-initiated write) results in multiple writes on the back end (one-to-many). This “write protection overhead” is important, of course, so that data remains available even in the event of a device failure.
    • Amplification overhead – SSDs also have a write amplification factor: a single write to the device can result in multiple writes to the underlying flash. Ouch! So not only are writes bad for SSDs, but when you do write to them, they end up writing more than you intended.

             Solution - Write LESS to the backend with inline deduplication
 
  2. Next, a background process reads the previously written blocks, computes the fingerprints, figures out the duplicate blocks, writes reference counts and then erases those duplicate blocks. Again, the trick is to understand this erase operation…
    • Turns out that when you tell an SSD to erase a block, it doesn’t really erase anything. All it does is mark that block as unused and move on. However, as it runs out of free blocks for new incoming I/Os from applications, it has no option but to look for unused but not yet erased blocks. It then reads those blocks, erases them and writes the incoming data. After a period of time, most writes start resulting in this “read -> erase -> write” sequence, slowing SSD performance substantially and noticeably, and getting worse as the SSD starts to fill up.

             Solution - AVOID read->erase->write cycle with inline deduplication
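To put rough numbers on the two steps above, here is a back-of-the-envelope sketch. The function name is hypothetical, and the protection copy count, write amplification factor and duplicate ratio are illustrative assumptions, not measurements from any product. It also ignores the extra metadata writes a post-process pass would add.

```python
def flash_writes(app_blocks, unique_fraction, copies, waf, inline):
    """Estimate physical flash block writes and blocks later invalidated.

    All parameters are illustrative assumptions:
      app_blocks      - blocks written by the application
      unique_fraction - share of those blocks that are not duplicates (0..1)
      copies          - back-end copies kept for protection
      waf             - SSD write amplification factor
      inline          - True for inline dedup, False for post-process
    """
    unique_blocks = app_blocks * unique_fraction
    # Inline dedup drops duplicates before they reach flash;
    # post-process dedup lands every block first.
    landed = unique_blocks if inline else app_blocks
    physical_writes = landed * copies * waf
    # Post-process dedup later discards the duplicates it already wrote,
    # leaving stale blocks that the SSD eventually has to erase.
    invalidated = (landed - unique_blocks) * copies
    return physical_writes, invalidated


post = flash_writes(1000, 0.25, copies=2, waf=2.0, inline=False)
inl = flash_writes(1000, 0.25, copies=2, waf=2.0, inline=True)
print("post-process:", post)
print("inline:      ", inl)
```

With these assumed inputs, the post-process path lands four times as many physical writes as the inline path and leaves stale blocks behind for the SSD to clean up, while the inline path leaves none.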

By the way, don’t let anyone obfuscate the facts by stating that their deduplication architecture computes hashes (fingerprints) inline and is therefore any better. The reality is that they still write all incoming write I/O (without deduplication). With their post-process dedup, they invariably push SSDs into the “read -> erase -> write” sequence very quickly, slowing down the SSDs. Kool-Aid is no answer to cold facts.
 
Needless to say, Atlantis deduplication is inline and has been since inception.
 
While implementing inline deduplication is necessary to be solid-state optimized, it is not sufficient on its own. If not done right, inline de-duplication can result in a performance impact and CPU over-utilization. This is where some architectures tend to throw in the towel by introducing a specialized piece of hardware – thereby negating the whole premise of software-defined storage. No matter what your definition of software-defined is, it cannot include specialized hardware :)
 
Atlantis architects sweated this out and implemented inline de-duplication that relies completely on commodity hardware. Moreover, by intelligently using RAM (remember, still significantly faster than flash), they do all metadata manipulation in RAM (so no flash wear) and thereby achieve a performance-oriented solution that has won the business and trust of numerous enterprise and financial institutions.
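As a minimal sketch of the general idea (not Atlantis USX’s actual implementation), the toy write path below keeps the fingerprint index and reference counts in ordinary in-memory dictionaries and touches the “flash” only when a block’s content has not been seen before. The class and attribute names are hypothetical, and SHA-256 is simply an assumed choice of fingerprint.

```python
import hashlib


class InlineDedupStore:
    """Toy inline-dedup write path with all metadata held in RAM."""

    def __init__(self):
        self.index = {}        # fingerprint -> physical block id (RAM)
        self.refcount = {}     # physical block id -> logical references (RAM)
        self.volume = {}       # logical block address -> physical block id (RAM)
        self.flash = []        # stand-in for the flash device
        self.flash_writes = 0  # number of blocks actually written to flash

    def write(self, lba, data):
        fp = hashlib.sha256(data).digest()
        block_id = self.index.get(fp)
        if block_id is None:
            # First time this content is seen: exactly one flash write.
            self.flash.append(data)
            block_id = len(self.flash) - 1
            self.index[fp] = block_id
            self.refcount[block_id] = 0
            self.flash_writes += 1
        # Duplicates only update RAM-resident metadata.
        self.refcount[block_id] += 1
        old = self.volume.get(lba)
        if old is not None:
            self.refcount[old] -= 1  # overwrite releases the old reference
        self.volume[lba] = block_id

    def read(self, lba):
        return self.flash[self.volume[lba]]


store = InlineDedupStore()
payloads = [b"A" * 4096, b"B" * 4096, b"A" * 4096, b"A" * 4096]
for lba, payload in enumerate(payloads):
    store.write(lba, payload)
print(store.flash_writes, "flash writes for", len(payloads), "application writes")
```

A real system would also need to persist or rebuild this metadata after a restart and reclaim blocks whose reference count drops to zero; the sketch leaves both out.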
 
So there you have it – an architecture that is truly made-for-solid-state. Here is how I summarize the landscape.

[Image: pdblog-03.png]
 
Let us know how we can help you accelerate your business, reduce cost, and mitigate risks. And join the conversation at @AtlantisSDS, @Priyadarshi_Pd.
 
 
 