Tuesday, January 11, 2011

Netapp Storage Basics

How does NetApp achieve storage efficiency? Let's talk about 6 ways.

1) RAID-DP
RAID-DP (RAID Double Parity) uses two parity disks in each RAID group. A group can therefore withstand 2 simultaneous disk failures without losing data.

2) Thin provisioning (FlexVol, FlexShare)
FlexVol: A FlexVol is a logical volume created inside an aggregate, which is a pool of physical disks. What does this mean? It means that FlexVols can be increased or decreased in size without concern for which physical disks in the aggregate hold the data.
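
As a rough illustration of this idea, the sketch below models an aggregate as a pool of physical blocks and a FlexVol as a logical size that only consumes blocks when data is written. The class names `Aggregate` and `FlexVol` and their methods are hypothetical and are not NetApp APIs.

```python
class Aggregate:
    """Hypothetical pool of physical blocks shared by several volumes."""

    def __init__(self, physical_blocks):
        self.physical_blocks = physical_blocks
        self.used_blocks = 0

    def allocate(self, count):
        # Physical space is consumed only when data is actually written.
        if self.used_blocks + count > self.physical_blocks:
            raise RuntimeError("aggregate is out of physical space")
        self.used_blocks += count


class FlexVol:
    """Hypothetical logical volume carved out of an aggregate."""

    def __init__(self, aggregate, logical_blocks):
        self.aggregate = aggregate
        self.logical_blocks = logical_blocks  # size promised to the user
        self.written_blocks = 0

    def resize(self, logical_blocks):
        # Growing or shrinking changes only the logical size.
        if logical_blocks < self.written_blocks:
            raise ValueError("cannot shrink below data already written")
        self.logical_blocks = logical_blocks

    def write(self, count):
        if self.written_blocks + count > self.logical_blocks:
            raise RuntimeError("volume is full")
        self.aggregate.allocate(count)
        self.written_blocks += count


aggr = Aggregate(physical_blocks=100)
vol_a = FlexVol(aggr, logical_blocks=80)
vol_b = FlexVol(aggr, logical_blocks=80)  # 160 logical blocks on 100 physical
vol_a.write(10)
vol_b.write(5)
vol_a.resize(120)                         # no physical blocks are moved
```

In this sketch the two volumes promise more space than the aggregate physically has, and only the 15 blocks actually written are charged against it.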

FlexShare: Because FlexVols share the same underlying storage, we can have a mix of applications running on one system. We use FlexShare to prioritize the volumes that serve the more important applications.

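3) Thin Replication (SnapVault, SnapMirror)
We copy X MB of data from diskA to diskB. Let's then add a small change d(x) to diskA. To catch up, diskB needs only that change on top of what it already copied from diskA:
diskB = diskA + d(x)
The small change d(x) is what is referred to as thin. Thin replication transfers only the small change d(x), not the whole X MB again, so it requires limited bandwidth.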
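
To make d(x) concrete, here is a minimal sketch that treats a disk as a list of blocks and sends only the blocks that differ. The functions `delta` and `apply_delta` are hypothetical names for illustration; this is not how SnapMirror or SnapVault actually compute changes.

```python
def delta(old_blocks, new_blocks):
    """Return {index: block} for blocks that are new or changed.

    Simplification: a source that shrinks is not handled.
    """
    changed = {}
    for index, block in enumerate(new_blocks):
        if index >= len(old_blocks) or old_blocks[index] != block:
            changed[index] = block
    return changed


def apply_delta(replica_blocks, changed):
    """Apply a delta to the replica and return the updated block list."""
    updated = list(replica_blocks)
    for index in sorted(changed):
        if index < len(updated):
            updated[index] = changed[index]
        else:
            updated.append(changed[index])
    return updated


disk_a = ["a0", "b0", "c0", "d0"]
disk_b = list(disk_a)            # initial full copy (the baseline transfer)

disk_a[2] = "c1"                 # a block changes on the source
disk_a.append("e0")              # and a new block is written

d_x = delta(disk_b, disk_a)      # only the changed blocks need to travel
disk_b = apply_delta(disk_b, d_x)
```

Here `d_x` holds two blocks (index 2 and index 4) even though diskA now has five, which is the bandwidth saving the text describes.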

SnapMirror: Here d(x) is replicated to ONE or MORE NetApp storage systems. We can mirror synchronously, asynchronously, or semi-synchronously. We can also cascade, so that each system passes d(x) on to the next:
diskB = diskA + d(x)
diskC = diskB + d(x)
diskD = diskC + d(x)

as opposed to

diskB = diskA + d(x)
diskC = diskA + d(x)
diskD = diskA + d(x)

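As you can see, in the second arrangement we are performing 3 reads from diskA. This is not good for performance. In the cascade, diskC gets the value of d(x) from diskB, not diskA, and diskD gets it from diskC, so diskA is read only once. This takes load off the source system.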
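
The difference between the two arrangements can be counted with a small sketch. The helper `reads_per_source` is a hypothetical name, and the model simply assumes one read of the source per transfer.

```python
from collections import Counter


def reads_per_source(transfers):
    """Count how often each disk is read, given (source, destination) pairs."""
    return Counter(source for source, _destination in transfers)


cascade = [("diskA", "diskB"), ("diskB", "diskC"), ("diskC", "diskD")]
fan_out = [("diskA", "diskB"), ("diskA", "diskC"), ("diskA", "diskD")]

cascade_reads = reads_per_source(cascade)
fan_out_reads = reads_per_source(fan_out)
```

In the cascade each disk is read once, while in the second arrangement all three reads land on diskA.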

SnapVault: This is an ALTERNATIVE to tape backups. It is a disk-based backup that runs over an IP network. Disk-based backups are generally considered more reliable, and quicker to restore from, than tape backups.

4) Snapshot copies
These are read-only, point-in-time images of a volume that can be retained for a long period of time. Snapshot copies take up hardly any space. How is this possible?

A snapshot copy does not copy data. It points to the blocks of storage that already exist.


VolA(time1) = A , B , C , D
SnapVolA(time1) = A, B , C, D

Let's assume block C changes. The new data is written to a new block, called (C).

VolA(time2) = A , B , (C) , D
SnapVolA(time2) = SnapVolA(time1)= A, B, C, D

At this point, the original block C remains on disk because SnapVolA still points to it.

Assume we take a new snapshot copy at time3, replacing the one from time1:

VolA(time3) = A, B, (C), D
SnapVolA(time3)= A, B, (C), D

Since neither VolA nor SnapVolA points to block C any longer, block C is freed.
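
The block bookkeeping in this example can be sketched as follows, assuming the time1 snapshot copy is deleted once the time3 one has been taken. The `Volume` class is hypothetical and only tracks which named blocks are still referenced.

```python
class Volume:
    """Hypothetical volume: the active file system and snapshots are lists of block names."""

    def __init__(self, blocks):
        self.active = list(blocks)
        self.snapshots = {}
        self.on_disk = set(blocks)

    def snapshot(self, name):
        # A snapshot copies pointers, not data.
        self.snapshots[name] = tuple(self.active)

    def overwrite(self, index, new_block):
        # New data goes to a new block; the old block is left in place.
        self.on_disk.add(new_block)
        self.active[index] = new_block
        self._free_unreferenced()

    def delete_snapshot(self, name):
        del self.snapshots[name]
        self._free_unreferenced()

    def _free_unreferenced(self):
        # A block stays on disk while the volume or any snapshot points to it.
        referenced = set(self.active)
        for blocks in self.snapshots.values():
            referenced.update(blocks)
        self.on_disk &= referenced


vol = Volume(["A", "B", "C", "D"])
vol.snapshot("time1")
vol.overwrite(2, "(C)")
c_kept = "C" in vol.on_disk        # still referenced by the time1 snapshot
vol.snapshot("time3")
vol.delete_snapshot("time1")
c_freed = "C" not in vol.on_disk   # nothing points to C any more
```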

5) Virtual copies (FlexClone / FlexCache)

FlexClone: A logical read/write clone of a FlexVol volume. This is similar to a snapshot copy, but a FlexClone is read/write instead of read-only. Another difference between a FlexClone and a snapshot copy is that a FlexClone can be split from its parent FlexVol.
Why do we want to use FlexClones? One common reason is to clone production environments for test and QA environments.
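
As a minimal sketch of the read/write clone idea, the code below lets a clone start with the same block pointers as its parent and diverge only where it is written to. `FlexVolSketch` and its methods are hypothetical names, not a NetApp interface.

```python
class FlexVolSketch:
    """Hypothetical volume modelled as a list of pointers into a shared block store."""

    def __init__(self, block_ids, store):
        self.block_ids = list(block_ids)
        self.store = store  # block id -> data, shared by parent and clones

    def clone(self):
        # The clone starts with the same pointers; no data is copied.
        return FlexVolSketch(self.block_ids, self.store)

    def write(self, index, data):
        # A write goes to a new block, so the other volume is unaffected.
        new_id = max(self.store) + 1
        self.store[new_id] = data
        self.block_ids[index] = new_id

    def read(self):
        return [self.store[block_id] for block_id in self.block_ids]


store = {0: "prod-a", 1: "prod-b", 2: "prod-c"}
production = FlexVolSketch([0, 1, 2], store)
test_env = production.clone()
test_env.write(1, "test-b")   # the test environment changes one block
```

After the write, the test clone sees its own version of the second block while production is untouched, and only one extra block has been stored.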

FlexCache: Here we have a storage caching solution. We deploy FlexCache volumes at several locations to spread the read workload. Data can therefore be served from the FlexCache closest to the client that requests it. This reduces latency.
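
The caching behaviour can be illustrated with a simple read-through cache. The `Origin` and `Cache` classes are hypothetical, and the sketch ignores cache invalidation when the origin data changes.

```python
class Origin:
    """Hypothetical origin volume that counts how often it is read."""

    def __init__(self, files):
        self.files = dict(files)
        self.reads = 0

    def read(self, name):
        self.reads += 1
        return self.files[name]


class Cache:
    """Hypothetical cache volume placed close to the clients."""

    def __init__(self, origin):
        self.origin = origin
        self.cached = {}

    def read(self, name):
        # Only the first read of a file goes back to the origin.
        if name not in self.cached:
            self.cached[name] = self.origin.read(name)
        return self.cached[name]


origin = Origin({"report.txt": "q1 numbers"})
site_cache = Cache(origin)
first = site_cache.read("report.txt")    # fetched from the origin
second = site_cache.read("report.txt")   # served locally
```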

6) Deduplication

Here we find and remove redundant blocks of data. How is this done? We find identical blocks by comparing their fingerprints.

We have File1 = B C D. We want to write File2 = E B A. Without deduplication, we have 6 blocks of data.

Fingerprint file1 = F(B), F(C), F(D)
Fingerprint file2 = F(E), F(B), F(A)


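We find a match on F(B). Because two different blocks could in principle share a fingerprint, we go back and compare block B in File1 with block B in File2, and find that they are the same. File2 is then stored as File2 = E, File1(B), A. It points to File1's copy of B, and File2's own copy of block B is marked as free.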
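
The fingerprint-then-verify steps can be sketched as below. Two things here are assumptions made for brevity: SHA-256 stands in for the fingerprint function F, and blocks are deduplicated as they are written rather than freed afterwards. `DedupStore` is a hypothetical name.

```python
import hashlib


def fingerprint(block):
    """Stand-in for F(): a SHA-256 digest of the block contents."""
    return hashlib.sha256(block).hexdigest()


class DedupStore:
    """Hypothetical block store that keeps one copy of each identical block."""

    def __init__(self):
        self.blocks = []          # physical blocks
        self.by_fingerprint = {}  # fingerprint -> index into self.blocks

    def write_file(self, file_blocks):
        pointers = []
        for block in file_blocks:
            fp = fingerprint(block)
            index = self.by_fingerprint.get(fp)
            # On a fingerprint match, compare the real data before sharing.
            if index is not None and self.blocks[index] == block:
                pointers.append(index)
                continue
            self.blocks.append(block)
            self.by_fingerprint[fp] = len(self.blocks) - 1
            pointers.append(len(self.blocks) - 1)
        return pointers


store = DedupStore()
file1 = store.write_file([b"B", b"C", b"D"])
file2 = store.write_file([b"E", b"B", b"A"])
```

With these inputs the store ends up holding 5 physical blocks instead of 6, and the second pointer of `file2` refers to the same block as the first pointer of `file1`.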


References

Introduction to Netapp Products, NetApp University Course. http://www.netapp.com/us/services/university/