Saturday, February 25, 2012

SAN drives and the transaction logs

In the old days when we have physical disk drives in our SQLServers, we
always located the logs on a seperate volume from the data files for
performance reasons. In the new era of using SAN storage for the database
files, does it still make sense to use seperate logical SAN drives for the
data and log files? Or will it make any performance difference for the data
and log files to be on the same SAN drive since we don't know where the data
is really physically going 'under the covers' of the SAN technology?
SAN vendors will tell you that it doesn't make any difference. Then they
will tell you to RAID5 all the disks to lower the entry cost of the array.
I continue to use the same best practices with separate LUNs for Data and
Log files. Plus those LUNS must be from different physical disk sets in the
array. On the highest-end arrays, the SAN may move stuff around to tune
itself, but for most of us, we still have to provision the storage system
manually. Just because we go from a semi-dumb system with megabytes of
cache to a moderately bright system having gigabytes of cache does not mean
the fundamentals change.
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP
"Jay Warmack" <JayWarmack@.discussions.microsoft.com> wrote in message
news:C4218672-9768-449F-AB68-95240B88828E@.microsoft.com...
> In the old days when we have physical disk drives in our SQLServers, we
> always located the logs on a seperate volume from the data files for
> performance reasons. In the new era of using SAN storage for the database
> files, does it still make sense to use seperate logical SAN drives for the
> data and log files? Or will it make any performance difference for the
> data
> and log files to be on the same SAN drive since we don't know where the
> data
> is really physically going 'under the covers' of the SAN technology?
|||"Jay Warmack" <JayWarmack@.discussions.microsoft.com> wrote in message
news:C4218672-9768-449F-AB68-95240B88828E@.microsoft.com...
> In the old days when we have physical disk drives in our SQLServers, we
> always located the logs on a seperate volume from the data files for
> performance reasons. In the new era of using SAN storage for the database
> files, does it still make sense to use seperate logical SAN drives for the
> data and log files?
Yes, it does. The SAN should be architected to provide multiple and separate
spindles.

> Or will it make any performance difference for the data
> and log files to be on the same SAN drive since we don't know where the
> data
> is really physically going 'under the covers' of the SAN technology?
Why would you not know?
Russ Kaufmann
MVP - Windows Server - Clustering
ClusterHelp.com, a Microsoft Certified Gold Partner
Web http://www.clusterhelp.com
Blog http://msmvps.com/clusterhelp
|||I agree with what Geoff and Russ said. Just want to add that in a large
enterprise, there is often tension between what an individual app wants and
what can be effectively managed and scale from the central storage
perspective. Treating all the storage requests generically often facilitates
SAN manageability and scalability, but may be sub-optimal for a particular
app.
If you are in a smaller environment or if your app is relatively important
in the grand scheme of 'SAN things', you may have greater leverage in making
more specific storage configuration requests, and you should take full
advanatge of that.

> files, does it still make sense to use seperate logical SAN drives for the
> data and log files?
It still makes sense to have data and log on different LUNs even if these
LUNs are not from completely different spindles in the array. If you look at
the end-to-end I/O path, this alone does give you some isolation.
Linchi
"Jay Warmack" wrote:

> In the old days when we have physical disk drives in our SQLServers, we
> always located the logs on a seperate volume from the data files for
> performance reasons. In the new era of using SAN storage for the database
> files, does it still make sense to use seperate logical SAN drives for the
> data and log files? Or will it make any performance difference for the data
> and log files to be on the same SAN drive since we don't know where the data
> is really physically going 'under the covers' of the SAN technology?
|||Yes, it still does. Keep in mind that with virtualized storage this can mean
that other high write applications can be using the same drives as your SQL
Server data and log files, so a SAN can offer lower performance than locally
attached storage. Most SANs have the capability of doing Zoning which is
where you can carve out disks in your SAN which you can dedicate to your
logs, or to your data files. These disks can similarly be made RAID 10 or
RAID 5 - whatever makes most sense for your SQL Server.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Jay Warmack" <JayWarmack@.discussions.microsoft.com> wrote in message
news:C4218672-9768-449F-AB68-95240B88828E@.microsoft.com...
> In the old days when we have physical disk drives in our SQLServers, we
> always located the logs on a seperate volume from the data files for
> performance reasons. In the new era of using SAN storage for the database
> files, does it still make sense to use seperate logical SAN drives for the
> data and log files? Or will it make any performance difference for the
> data
> and log files to be on the same SAN drive since we don't know where the
> data
> is really physically going 'under the covers' of the SAN technology?
|||First of all, I want to say that I agree with all of the other respondents;
however, I would like to point out a few misconceptions when dealing with
consolidation, whether that may be Server and/or Storage consolidation.
So, to begin, at the nut of things, we are still working with physical
devices, from the host, through the bus, out through the network, between
the layers, and to the disks, but at the end of the day, they are still
disks.
Now, pick a disk, any disk, and let's say it's a 15K RPM drive. How many
sustained IOPS can it support? Does that change if the IO stream comes from
a single host or from an aggregated set of hosts? What if the IO is
directed by an intelligent controller?
Yes, it was rhetorical. The answer is that it makes absolutely no
difference. With slight variations among the vendors, a practical limit
would be about 150 to 200 IOPS per device, at best. Other than capacity and
rotational speed, not much has changed in disk throughput in nearly 20
years. To combat this physical limitation, RAID and controller cache was
created; thus, we stripe disks, and then we stripe again. We buy as much
memory as we can, on each of the sub-layers, and we cache. We do everything
in our power to keep from using those disks as often as we can.
This gets to the next point: consolidation. Now, let's say that we want to
build a single large SQL Server installation and consolidate all of the
little independent installations to this one. How do you size the
consolidated server?
Again, it is rhetorical. The same way as if you were building a single
large database installation. You have 4 main sub-systems to size: CPU,
Memory, Disk IO, and Network IO. The system must be built to handle the
aggregated load from all of the applications just as if they were all
directed at a single solution.
Now, consolidated systems tend to drive randomness to an extreme, which also
lowers the throughput of the underlying sub-systems.
Storage consolidation is no different. Just because we can buy larger
capacity drives and consolidate all of our disparate installations onto a
single storage fabric does not mean we have sufficient IO throughput. We
have to consider all metrics, just like system consolidation.
Unfortunately, disks capacities have grown far faster than their respective
IO limitations. 5,400 RPM drives can support about 50 random IOPS; 7,200
RPM, about 70 IOPS; 10K RPM, 100 to 120 IOPS; and 15K RPM, 150 IOPS.
What is the typical OLTP database size? What is a typical associated IOPS
for such a size?
My experience says about 1 to 10 GB, perhaps 50 to 100 GB would not be
unheard of. Anything larger and you would be needing to discuss archiving,
or at least partitioning some of the data, or you are not talking about a
strictly OLTP system anymore. Once you get into DSS and OLAP system
discusses, size and throughput start to converge; so, one does not tend to
see these problems as much on those types of systems. So, for now, let's
stay with the OLTP discussion.
Now, what about the IOPS? Well, that's the funny thing: there is no
correlation between size and usage. There may be a general linear increase
relationship, but I have seen busy, small and large, as well as idle small
and large. Nevertheless, 500 to 1,000 random IOPS would not be atypical for
any reasonably scaled solution. Coupled with the periodic Checkpoint
process, 10,000 IOPS and more would certainly not be out of the question.
In the case of server consolidation, these levels tend to be about the same;
however, the IOPS tend to reduce the burstiness and level off, but maintain
the same high IO request limits, but now, they are sustained levels.
By way of example, let us say we have 10 databases to consolidate, each
about 10 GB in size, each about 500 to 1,000 sustained random IOPS.
How many disks (internal or SAN-based) would be needed to store the data?
One, probably, perhaps 2 to 4 smaller ones. We would be talking about only
from 100 to 200 GB in aggregate.
How about the IOPS? How many disks? At least 30 to 70.
Now, after all the SAN Administrators pick their jaws up off the floor,
consider this.
What if we are NOT consolidating servers? What if, instead, we only
consolidate the storage on a SAN, but maintain the same number of 10
individual installations. If you virtualize the storage, ask yourself the
same questions. How many disks to store the data? How many disks to
support the IOPS load?
...
You better say the same number, as a minimum, and perhaps more because under
a consolidated server, the host knows about the IO requirements, but under
the Storage-only consolidation, each host is "blind" to the others, and, as
such, IO contention can increase.
If you answered any other way, then you know now exactly why your SAN is not
performing as well as dedicated arrays.
Finally, to make one last point, YOU BETTER KNOW THE UNDERLYING DISK
CONFIGURATION, you're the DBA; it's your job! You will need to spend lots
and lots of time with your SAN Administrators. I've always found bribes to
work best.
Sincerely,
Anthony Thomas

"Jay Warmack" <JayWarmack@.discussions.microsoft.com> wrote in message
news:C4218672-9768-449F-AB68-95240B88828E@.microsoft.com...
> In the old days when we have physical disk drives in our SQLServers, we
> always located the logs on a seperate volume from the data files for
> performance reasons. In the new era of using SAN storage for the database
> files, does it still make sense to use seperate logical SAN drives for the
> data and log files? Or will it make any performance difference for the
data
> and log files to be on the same SAN drive since we don't know where the
data
> is really physically going 'under the covers' of the SAN technology?

No comments:

Post a Comment