MSFN Forum: HDD performance <-> Allocation unit size - MSFN Forum




HDD performance <-> Allocation unit size

#21 User is offline   DiracDeBroglie 

  • Junior
  • Pip
  • Group: Members
  • Posts: 88
  • Joined: 07-December 11
  • OS:Windows 7 x64
  • Country: Country Flag

Posted 18 June 2012 - 05:49 AM

View Postjaclaz, on 18 June 2012 - 02:22 AM, said:

Which role (if any) would NCQ (Native Command Queuing) have in the benchmark(s)?
jaclaz


In ATTO Disk Benchmark I tried out Queue Depth (QD)=4 and =10. On my recently purchased Win7 notebook there was no difference in the results between QD=4 and =10. However, on my 8-year-old WinXP notebook, the difference between QD=4 and =10 is clearly visible in the output graph. I am not sure what the exact relation is between QD in ATTO and NCQ. Furthermore, my HDDs have an NCQ depth of 32, while QD in ATTO only goes up to 10. On my Win7 notebook I also don't see the relevance of NCQ or QD when large chunks of data (from sector X to sector Y) are dumped from the HDD into the RAM-DMA during the ATTO test; there is no need for complicated seeking where the read/write heads have to wobble over the platters.

Well, the performance of the HDDs (internal and external) on my Win7 notebook is what it is; there is no way to go beyond its physical limits. To me, the most important thing is that I can get the maximum out of the HDDs and also understand why and how particular hardware parameters of the HDDs relate to performance (optimization).

By the way, some time ago we had a discussion about misaligned partitions on Advanced Format HDDs (4K-sector drives), like my 2TB external USB3.0 drive. I have now run an ATTO Disk Benchmark test on a partition on the 2TB HDD that was correctly aligned because it had been created under Win7, and ran the test again after deleting the partition and re-creating it under WinXP (so misaligned). The WinXP-created partition had a slightly lower read performance (1% less, maybe not even that) than the Win7-created partition. However, its write performance was something like 10% lower than that of the Win7-created partition. I get the impression that partition misalignment may not be that much of a performance issue on Advanced Format drives. All performance tests (including those for the WinXP-created partition) were done on the Win7 notebook.

In the process of doing those tests, I ran into trouble with my 2TB HDD, which is also my data backup drive. The first partition is a primary partition, followed by an extended partition containing 4 logical partitions; all partitions were created under Win7. Then, on my WinXP notebook, after I had deleted and re-created the first (primary) partition, the second, third and fourth logical partitions disappeared; the first logical partition and the extended (shell) partition, however, remained intact (I checked that with PTEdit32 and PartInNT). I tried to retrieve the data from the lost partitions using GPartED, but after almost 10 hours of "retrieving", GPartED gave up. So, all my data on those 3 logical partitions is gone. Lesson to be learned: WinXP and Win7 are not quite compatible when it comes to partitioning (which I already knew), and that can have disastrous consequences (which I have now learned the hard way).

johan


#22 User is offline   jaclaz 

  • The Finder
  • Group: Developers
  • Posts: 11,574
  • Joined: 23-July 04
  • OS:none specified
  • Country: Country Flag

Posted 18 June 2012 - 06:31 AM

View PostDiracDeBroglie, on 18 June 2012 - 05:49 AM, said:


In the process of doing those test, I ran into trouble with my 2TB HDD, which is also my data backup drive. The first partition is a primary partition, followed by an extended partition containing 4 logical partitions; all partition were created under Win7. Then, on my WinXP notebook, after having deleted and re-created the first (primary) partition, the second, third and fourth logical partitions disappeared; the first logical partition, however, and the extended (shell) partition remained intact (checked that out with PTEdit32 and PartInNT). I tried to retrieve the data from the lost partitions using GPartED but after almost 10 hours of "retrieving", GPartED gave up. So, all my data on those 3 logical partitions is gone. Lesson to be learned: WinXP and Win7 are not quite compatible when it comes to partitioning (which I already knew) and that can have disastrous consequences (that I have now learned the hard way).

Yep, that is seemingly "by design" (as the good MS guys would put it) :w00t: .
Sorry for your mishap, but you are not the first one:
http://reboot.pro/9897/
AND you had already been warned :ph34r: about the issue:
http://www.msfn.org/...post__p__984342
BTW the data is normally perfectly recoverable, provided you properly use a data recovery app, which GParted is not, AFAIK.

jaclaz

#23 User is offline   DiracDeBroglie 

  • Junior
  • Pip
  • Group: Members
  • Posts: 88
  • Joined: 07-December 11
  • OS:Windows 7 x64
  • Country: Country Flag

Posted 18 June 2012 - 10:35 AM

View Postjaclaz, on 18 June 2012 - 06:31 AM, said:

BTW the data is normally perfectly recoverable, that is if you use properly a data recovery app that GPARTED is not AFAIK.
jaclaz


Based on your experience, which recovery apps/soft are worth having at hand when it comes to this sort of partition (table) damage?

johan

PS: In Win7 I deleted the very first partition on my 2TB HDD (the primary, WinXP-created one), and then created a New Simple Volume in the Win7 Disk Manager. After checking the 2TB HDD with GPartED I noticed a gap of exactly 1MB of UNallocated disk space between the newly created primary partition and the extended partition shell that follows it (which has never been damaged). I could only see the 1MB gap in GPartED, not at all in the Win7 Disk Manager. So, in GPartED I got rid of the 1MB gap by extending the newly created primary partition.
The very last partition (a primary one following the extended (shell) partition) was also followed by an UNallocated area, of 2MB!!?? I got rid of that too by extending the last primary partition in GPartED. Here, too, the 2MB gap was not visible in the Win7 DM.

This post has been edited by DiracDeBroglie: 18 June 2012 - 10:47 AM


#24 User is offline   jaclaz 

  • The Finder
  • Group: Developers
  • Posts: 11,574
  • Joined: 23-July 04
  • OS:none specified
  • Country: Country Flag

Posted 18 June 2012 - 11:49 AM

View PostDiracDeBroglie, on 18 June 2012 - 10:35 AM, said:

Based on your experience, which recovery apps/soft are worth having at hand when it comes to this sort of partition (table) damage?

It depends.
For a semi-automated attempt, TESTDISK.
For manual recovery I tend to use Tiny Hexer (with my small viewers/templates for it).
A very good app in my view is dmde (though very powerful, more handy for filesystem recovery)
BUT read this:
http://www.msfn.org/...post__p__943645
(and links within it)


View PostDiracDeBroglie, on 18 June 2012 - 10:35 AM, said:


PS: In Win7 I deleted the very first (primary and WinXP-created) partition (on my 2TB HDD), and then created a New Simple Volume in the Disk Manager in Win7. After checking the 2TB HDD with GPartED I noticed there was a gap of UNallocated disk space between the newly created primary partition and the next extended partition shell (which has never been damaged) of exactly 1MB. I could only see the 1MB gap in GPartED, but not at all in the Disk Manager in Win7. So, in GPartED I got rid of the 1MB gap by expanding/extending the newly created primary partition.
But also the very last partition (which is primary one following the extended (shell) partition) was followed by an UNallocated area of 2MB!!?? I got rid of that too by extending the last prim partition in GPartED. But also here the 2MB gap was not visible in the Win7 DM.

Well, the general idea is to STOP fiddling with a disk as soon as you find an issue :ph34r:.

In the Windows XP "way of thinking", since everything is related to cylinder boundaries, only steps of around 8 MB are "sensed" by the Disk Manager (1x255x63=16065 sectors; 16065x512=8225280 bytes).
Cannot say about 7, but most probably it has similar issues but with the same "pre-sets" that affect diskpart and that can be overridden through the Registry:
http://www.911cd.net...showtopic=21186
http://www.911cd.net...pic=21186&st=18
http://support.micro...kb/931760/en-us

The 1 Mb (which is not a "measure": bytes are units, while Megabytes or Mebibytes or whatever can be displayed in several different ways by different utilities using different conventions) could be "normal", the 2 Mb less so :unsure:

A typical Vista :ph34r: or 7 first partition is aligned to 1 Mb (2048 sectors), but there may be variations; it is possible that - for any reason - the gap you found of around 1 Mb in size is actually smaller than 1 Mb (and thus results in "a suffusion of yellow") and that the other gap of around 2 Mb is not a multiple of 1 Mb and thus is ignored.
Cannot say.
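
The arithmetic behind those figures can be checked in a few lines. This is only a sketch: it assumes 512-byte logical sectors and 4096-byte physical sectors, and the start sector 63 used for the XP-style partition is the commonly cited default, not a value read from the disk discussed in this thread. The helper name `is_aligned` is made up for the illustration.

```python
SECTOR_BYTES = 512        # logical sector size (assumption)
PHYSICAL_BYTES = 4096     # Advanced Format physical sector size (assumption)
MIB = 1024 * 1024

# One CHS "cylinder" with the usual 255 heads x 63 sectors-per-track geometry
cylinder_sectors = 1 * 255 * 63
cylinder_bytes = cylinder_sectors * SECTOR_BYTES


def is_aligned(start_lba, boundary_bytes):
    """True if a partition starting at start_lba begins exactly on the boundary."""
    return (start_lba * SECTOR_BYTES) % boundary_bytes == 0


print(cylinder_sectors, cylinder_bytes)   # 16065 8225280
print(MIB // SECTOR_BYTES)                # 2048 sectors in 1 MiB
print(is_aligned(63, PHYSICAL_BYTES))     # False: XP-style start, misaligned on 4K
print(is_aligned(2048, PHYSICAL_BYTES))   # True: Vista/7-style 1 MiB start
```

A start at sector 63 lands 3584 bytes into a 4096-byte physical sector, which is why writes on such a partition can force the drive into read-modify-write cycles.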

Here is an example of a partition recovery (to give you an idea of the kind of approach):
http://www.msfn.org/...-after-bsy-fix/
and another one:
http://www.msfn.org/...ith-value-data/

jaclaz

#25 User is offline   pointertovoid 

  • Advanced Member
  • PipPipPip
  • Group: Members
  • Posts: 406
  • Joined: 16-January 09

Posted 18 June 2012 - 04:00 PM

View PostDiracDeBroglie, on 17 June 2012 - 06:29 AM, said:

...from your explanation I infer that large chunks of data, linearly scooped up from the HDD (from sector X to sector Y), are being dumped into the RAM area that is foreseen for the DMA on the motherboard!?

Bonsoir - Goedenavond Johan! Yes, this is how I understand it. Until I change my mind, of course.

View PostDiracDeBroglie, on 17 June 2012 - 06:29 AM, said:

...with *Direct I/O* UNchecked and the results were stunning; the graphical performance reading in ATTO Disk Benchmark went up to 1600 MBytes/Sec, almost 3 times the maximum SATA-III bandwith!!! Hence that I think that with buffering or caching ATTO means the RAM-DMA on the motherboard...

What Atto could mean by "direct I/O" is that it does NOT use Windows' system cache.

The system cache (since W95 at least) is a brutal feature: everything read from or written to disk is kept in main Ram as long as that memory is not requested for other purposes. Once data has been read or written, it stays available in Ram.

Win takes hundreds of megabytes for this purpose, with no absolute size limit. So the "cache" on a hard disk, which is much smaller than the main Ram, cannot serve as a cache with Win, since this OS will never re-ask for recent data. Silicon memory on a disk can only serve as a buffer - which is useful especially with Ncq.

Very clear to observe with a diskette. Reading or writing on it takes many seconds, but re-opening a file already written is immediate.
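
A rough way to observe this yourself is to time two consecutive reads of the same file. The sketch below is only an illustration with made-up helper names (`timed_read`, `compare_reads`); the timings depend entirely on the machine, and because the file has just been written, even the first read is likely to be served from the system cache, which is exactly the diskette effect described above.

```python
import os
import tempfile
import time


def timed_read(path):
    """Read the whole file and return (seconds elapsed, bytes read)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        data = f.read()
    return time.perf_counter() - start, len(data)


def compare_reads(size_bytes=8 * 1024 * 1024):
    """Write a temporary file, then read it twice through the OS cache."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(size_bytes))
        path = tmp.name
    try:
        first = timed_read(path)
        second = timed_read(path)
    finally:
        os.remove(path)
    return first, second


first, second = compare_reads()
print("first read : %.4f s for %d bytes" % first)
print("second read: %.4f s for %d bytes" % second)
```

To see a genuinely "cold" read for comparison, the file would have to be read after a reboot or with the cache bypassed, which is what a benchmark's direct I/O option is for.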

-----

Cluster size matters little as a result of the Udma command. It specifies a number of sectors to be accessed, independently of cluster size and position. A disk doesn't know what a cluster is. Only the OS knows it to organize its Mft (or Fat or equivalent) and compute the corresponding sector address. The only physical effect on a disk is where files (or file chunks if fragmented) begin; important on Flash storage, less important in a Raid of mechanical disks, zero importance on a single mechanical disk where tracks have varied unpredictable bizarre sizes within one disk.

This post has been edited by pointertovoid: 18 June 2012 - 04:20 PM


#26 User is offline   pointertovoid 

  • Advanced Member
  • PipPipPip
  • Group: Members
  • Posts: 406
  • Joined: 16-January 09

Posted 18 June 2012 - 04:41 PM

View PostDiracDeBroglie, on 17 June 2012 - 07:38 AM, said:

... I may purchase an SSD in the near future.... do you know any documents, websites, links, references, or whatever reading, which could give me a deeper insight about how SSDs are designed and work?


Yes, get an Ssd at least for the OS, your documents, and small applications. They're hugely better than mechanical disks for any data that fits the limited size. Affordable as used disks on eBay: I got a used Corsair F40 (40GB MLC with Sandforce) delivered for 50€, it's excellent.

Their efficient use is difficult to understand and implement, and little simple reading is available. Wiki?
The buyer-user should understand in advance SLC vs MLC, especially the implications for write delay, and the resulting need for a good cache AND cache strategy, which means a controller and its firmware, all-important with MLC. Older MLC are bad.
Wear leveling is also very important to understand but not simple, especially as write performance drops over time with older Win. With Seven (or Vista?) and up, write performance shouldn't drop; with Xp, you can tweak the same effect; with W2k I couldn't let these tweaks run - but my disk is an SLC X25-E anyway :whistle: , nearly insensitive to anything, and so fast that it awaits the Cpu.
Alignment is important for speed; some measurements there http://www.msfn.org/...n-its-clusters/ on a CF card with Ntfs (easy) and Fat32 (needs tweaking).
Dandu.be made exhaustive measurements but he didn't tweak alignment, alas.

#27 User is offline   pointertovoid 

  • Advanced Member
  • PipPipPip
  • Group: Members
  • Posts: 406
  • Joined: 16-January 09

Posted 18 June 2012 - 05:16 PM

View Postjaclaz, on 18 June 2012 - 02:22 AM, said:

Which role (if any) would NCQ (Native Command Queuing) have in the benchmark(s)?


NCQ makes a huge difference in real life. When booting W2k (whose disk access isn't as optimized as by Xp with its prefetch) it means for instance 18s instead of 30s, so Ahci should always be used, even if it requires an F6 diskette.

Different disks have different Ncq firmware, resulting in seriously different experienced performance, and this cannot be told in advance from the disks' datasheets, nor from benchmarks of access time and contiguous throughput. Anyway, the arm positioning time between nearby tracks should matter more than random access over the whole disk, which real usage never requires.

Few benchmarks make a sensible measurement here. Ncq is meaningful only if a request queue is used, but Atto, for instance, isn't very relevant here; it seems to request nearly-contiguous accesses, which give unrealistically high performance and show little difference between Ncq strategies; it is a nice tool, however, to observe quickly whether a volume is aligned on a Flash medium. The reference here is IOMeter, but it's not easy to use, especially for access alignment. Recent CrystalDiskMark versions test with Q=32, which is far too much for a workstation (Q=2 to 4).
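
Why a request queue helps can be illustrated with a toy model. The sketch below is not how any real Ncq firmware works: it ignores rotational position entirely and simply lets the "drive" serve the nearest of the requests currently in its queue (greedy shortest-seek-first). The function name and the track numbers are invented for the illustration.

```python
import random


def total_head_travel(requests, queue_depth):
    """Total track distance covered when the drive may always pick the
    nearest of up to queue_depth pending requests (toy model)."""
    pending = []
    upcoming = iter(requests)
    head = 0
    travel = 0
    while True:
        # Keep the queue filled up to queue_depth while requests remain
        while len(pending) < queue_depth:
            nxt = next(upcoming, None)
            if nxt is None:
                break
            pending.append(nxt)
        if not pending:
            return travel
        target = min(pending, key=lambda track: abs(track - head))
        pending.remove(target)
        travel += abs(target - head)
        head = target


rng = random.Random(42)
tracks = [rng.randrange(100_000) for _ in range(2_000)]
for qd in (1, 4, 32):
    print("queue depth %2d -> head travel %d tracks" % (qd, total_head_travel(tracks, qd)))
```

With a queue depth of 1 the requests are served strictly in arrival order; with deeper queues the head travels far less for the same set of random requests, which is the effect a purely contiguous benchmark cannot show.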

Example of IOMeter with Q=1, versus Q=4 of one 600GB Velociraptor, versus Q=4 of four VRaptors in aligned Raid-0, on random 4k reads over 100MB (but 500MB for the Raid) (an SSD would smash N* 10k IO/s in this test):

[Attached images: IOM_100MB_k004r_Q01.png, IOM_100MB_k004r_Q04.png, IOM500MB_1MiB_k004r_Q04.png]

As a practical consequence, an Ssd should be chosen based on the few tests available, which come only from users: IOMeter, or failing that CrystalDiskMark, which at least tells the random delay for small writes; just as good are "practical" comparisons of copying or writing many files. On the disks I own, I check the "find" time of some file names among ~100,000 Pdf in many subdirectories, the write time of many tiny files...

This post has been edited by pointertovoid: 18 June 2012 - 05:24 PM


#28 User is offline   jaclaz 

  • The Finder
  • Group: Developers
  • Posts: 11,574
  • Joined: 23-July 04
  • OS:none specified
  • Country: Country Flag

Posted 18 June 2012 - 08:00 PM

View Postpointertovoid, on 18 June 2012 - 05:16 PM, said:

View Postjaclaz, on 18 June 2012 - 02:22 AM, said:

Which role (if any) would NCQ (Native Command Queuing) have in the benchmark(s)?


NCQ makes a huge difference in real life. When booting W2k (whose disk access isn't as optimized as by Xp with its prefetch) it means for instance 18s instead of 30s, so Ahci should always be used, even if it requires an F6 diskette.

Different disks have different Ncq firmware resulting in seriously different experienced performance, and this cannot be told in advance from the disks' datasheet, nor from benchmarks of access time and contiguous throughput. Anyway, the arm positioning time between near tracks should be more important than random access over the whole disks, which usage never requires.

Few benchmarks make a sensible measurement here. Ncq is meaningful only if a request queue is used, but for instance Atto isn't very relevant here; it seems to request nearly-contiguous accesses, which give unrealistic high performance and show little difference between Ncq strategies; nice tool however to observe quickly if a volume is aligned on a Flash medium. The reference here is IOMeter but it's not easy to use, especially for access alignment. Recent CrystalDiskMark tests with Q=32 which is far too much for a workstation (Q=2 to 4).


Really? :unsure:

Guess why I had previously posted this? :whistle:

View Postjaclaz, on 25 May 2012 - 09:35 AM, said:

The whole point is that benchmarks are - generally speaking - benchmarks ;) and they are ONLY useful to compare different settings (or different OS or different hardware), BUT the results need to be verified.
In no way are they (or can they be) representative of "real usage".
In other terms, it is perfectly possible that the results of a benchmark (which is an "abstract" set of copying data with a given method) seem "almost the same" but in real usage a BIG difference is actually "felt", or vice versa, it is perfectly possible that in a benchmark a given setting produces an astoundingly "better" result, but then when the setting is applied in "real life" no (or very little) difference is "felt".


and why I asked the question in the form:

View Postjaclaz, on 18 June 2012 - 02:22 AM, said:

Which role (if any) would NCQ (Native Command Queuing) have in the benchmark(s)?

:rolleyes:



jaclaz

#29 User is offline   DiracDeBroglie 

  • Junior
  • Pip
  • Group: Members
  • Posts: 88
  • Joined: 07-December 11
  • OS:Windows 7 x64
  • Country: Country Flag

Posted 19 June 2012 - 09:44 AM

View Postpointertovoid, on 18 June 2012 - 05:16 PM, said:


NCQ makes a huge difference in real life. When booting W2k (whose disk access isn't as optimized as by Xp with its prefetch) it means for instance 18s instead of 30s, so Ahci should always be used, even if it requires an F6 diskette.

Different disks have different Ncq firmware resulting in seriously different experienced performance, and this cannot be told in advance from the disks' datasheet, nor from benchmarks of access time and contiguous throughput. Anyway, the arm positioning time between near tracks should be more important than random access over the whole disks, which usage never requires.

Few benchmarks make a sensible measurement here. Ncq is meaningful only if a request queue is used, but for instance Atto isn't very relevant here; it seems to request nearly-contiguous accesses, which give unrealistic high performance and show little difference between Ncq strategies; nice tool however to observe quickly if a volume is aligned on a Flash medium. The reference here is IOMeter but it's not easy to use, especially for access alignment. Recent CrystalDiskMark tests with Q=32 which is far too much for a workstation (Q=2 to 4).

Example of IOMeter with Q=1, versus Q=4 of one 600GB Velociraptor, versus Q=4 of four VRaptors in aligned Raid-0, on random 4k reads over 100MB (but 500MB for the Raid) (an SSD would smash N* 10k IO/s in this test):


What is a F6 diskette?

>>The reference here is IOMeter but it's not easy to use, especially for access alignment.

What is access alignment?

-----------------------------------
I did the IOmeter test too on my 1TB Seagate Barracuda 7200.12 system HDD, with 3 different options in the pane |Disk Targets| -> |# of Outstanding I/O|=1, then =4 and again =32; I suppose that must be the Queue Depth equivalent to (or equal to) the NCQ in the HDD (correct me if I'm wrong). I'm not all too sure how to interpret the options |Maximum Disk Size (in sectors)| and |Starting Disk Sector|; do these need a figure?? I left them at zero during my tests.

In the pane |Access Specifications| I've put the value |4K; 100%Read; 0%Random|. The results from the tests are shown in the screenshots hereafter for #I/O=1, =4 and =32.


[Attached screenshots: IOmeterQD1; IOmeterQD4; IOmeterQD32]


The difference in performance between #I/O=1 and =4 seems significant; the difference between #I/O=4 and =32 is smaller, but still noticeable. Anyhow, with #I/O=32 the performance comes pretty close to what I had earlier with ATTO Disk Benchmark.

I also performed a test with #I/O=64 (no screenshot); the performance was better than with #I/O=1, but poorer than with #I/O=4. It looks like offering too many (>32) parallel I/Os chokes the NCQ handling in the HDD's controller.

Still a question about the |Access Specifications| pane where I've put |4K; 100%Read; 0%Random|. All available options had 0% Random. I'm not sure what is meant by "Random"; does it mean random sector locations on the HDD's platters? If yes, then 0% Random should mean 100% non-Random, right? And 100% non-Random should then mean 100% sequential (contiguous), which basically puts us in the same context as ATTO. But if both IOmeter and ATTO perform nearly-contiguous measurements, why does |# of Outstanding I/O| in IOmeter impact the performance so much, while the Queue Depth in ATTO hardly influences it? The sticking point is how to interpret "Random" in |4K; 100%Read; 0%Random| for IOmeter.

johan

PS: I couldn't figure out how to introduce the image file inline in the text; they are now attached at the back.



This post has been edited by DiracDeBroglie: 19 June 2012 - 09:46 AM


#30 User is offline   jaclaz 

  • The Finder
  • Group: Developers
  • Posts: 11,574
  • Joined: 23-July 04
  • OS:none specified
  • Country: Country Flag

Posted 19 June 2012 - 09:52 AM

View PostDiracDeBroglie, on 19 June 2012 - 09:44 AM, said:

What is a F6 diskette?

The "traditional" method to add a specific driver that is not already included in a NT based system.
See:
http://www.msfn.org/...a-floppy-drive/

Typical is a Mass Storage device driver.
When you start a NT setup you are asked to press F6 to add the driver through a floppy disk.

At least in XP (cannot say in 2k) it is possible (on motherboards that provide both Legacy IDE emulation and AHCI) to install "normally" and then install the driver and switch to it:
http://www.msfn.org/...409#entry884409

jaclaz

#31 User is offline   DiracDeBroglie 

  • Junior
  • Pip
  • Group: Members
  • Posts: 88
  • Joined: 07-December 11
  • OS:Windows 7 x64
  • Country: Country Flag

Posted 19 June 2012 - 11:53 AM

View Postpointertovoid, on 18 June 2012 - 04:00 PM, said:

The system cache (since W95 at least) is a brutal feature: every read or write to disk is kept in main Ram as long as no room is requested in this main memory for other puposes. If data has been read or written once, it stays available in Ram. Win takes hundreds of megabytes for this purpose, with no absolute size limit.

Is this some kind of a "RAM disk", but where its content is a perfect mirror of its respective file content on the HDD? (so that a powerfailure doesn't mess up file consistency on the HDD?)

Does that mean that the UDMA controller has access to whatever motherboard RAM size (hundreds of MBs, or even GBs) and where-ever located in the motherboard RAM?

I know that on old OSs there was a thing like RAM disk, but its content was not always consistent with the HDD, floppy, JAZZ, Zip drive content. As a result powerfailure was usually very damaging to your files on the HDD. I assume that kind of powerfailure damage cannot happen when using DMA for data transfer between motherboard RAM and the HDD? Or can it?

johan

This post has been edited by DiracDeBroglie: 19 June 2012 - 11:59 AM


#32 User is offline   pointertovoid 

  • Advanced Member
  • PipPipPip
  • Group: Members
  • Posts: 406
  • Joined: 16-January 09

Posted 19 June 2012 - 12:19 PM

View PostDiracDeBroglie, on 19 June 2012 - 09:44 AM, said:

[Numbering and slicing by Pointertovoid, apologies for my bad manners]
(1) # of Oustanding I/O|=1, then =4 and again =32; I suppose that must be the Queue Depth equivalent to (or equal to) the NCQ in the HDD
(2) I'm not all to sure how to interpret the options |Maximum Disk Size (in sectors)| and |Starting Disk Sector|;
(3) I've put the value |4K; 100%Read; 0%Random|.
(4) I couldn't figure out how to introduce the image file inline in the text; they are now attached at the back.


(1) I too understand it as the number of simultaneous requests. 32 is huge, fits servers only. 2 or 4 is already much for a W2k or Xp workstation.
(2) IOMeter measures within a file of the size you define, beginning where you wish if possible (I hope it checks it before). The file size should represent the size of the useful part of the OS or application you want to access; 100MB is fine for W2k but too small for younger OS. After the first run begins, I stop it and defragment the volume so this file is contiguous.
(3) You lost. 0% random means contiguous reads, which explains why you get figures as unrealistic as with Atto. I measure with 100% random, which is pessimistic. You define it by editing (or edit-copying) a test from the list; a slider adjusts that.
(4) Something like "place inline" after the file is uploaded and the cursor is where you want.
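
To make point (3) concrete, here is a small sketch of what a "percent random" setting could mean for the sequence of offsets a benchmark requests. It is an assumed model, not IOMeter's actual code: each request either jumps to a random block inside the test file or continues right after the previous one. The function name and parameters are invented for the illustration.

```python
import random


def access_offsets(n_requests, file_size, block=4096, percent_random=0, seed=0):
    """Byte offsets of successive requests inside a test file.

    With probability percent_random/100 a request jumps to a random block;
    otherwise it continues right after the previous request (wrapping at
    the end of the file).
    """
    rng = random.Random(seed)
    n_blocks = file_size // block
    pos = 0
    offsets = []
    for _ in range(n_requests):
        if rng.random() * 100 < percent_random:
            pos = rng.randrange(n_blocks)
        offsets.append(pos * block)
        pos = (pos + 1) % n_blocks
    return offsets


MB_100 = 100 * 1024 * 1024
print(access_offsets(5, MB_100, percent_random=0))    # [0, 4096, 8192, 12288, 16384]
print(access_offsets(5, MB_100, percent_random=100))  # five unrelated block offsets
```

With 0% random the drive only ever sees the next contiguous block, so its read-ahead does all the work; every random jump, by contrast, costs a seek plus rotational delay.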

#33 User is offline   pointertovoid 

  • Advanced Member
  • PipPipPip
  • Group: Members
  • Posts: 406
  • Joined: 16-January 09

Posted 19 June 2012 - 12:27 PM

View PostDiracDeBroglie, on 19 June 2012 - 11:53 AM, said:

[System cache]
Is this some kind of a "RAM disk", but where its content is a perfect mirror of its respective file content on the HDD? (so that a powerfailure doesn't mess up file consistency on the HDD?)

Does that mean that the UDMA controller has access to whatever motherboard RAM size (hundreds of MBs, or even GBs) and where-ever located in the motherboard RAM?


As I understand it, read or write requests to sectors not yet accessed are sent to the disk immediately (unless you check "delayed write" in the driver properties), but any data read or written stays in the mobo Ram as long as possible for future reuse, that is, as long as Ram space is available. This means that the disk is still updated immediately (if there is no delayed write).

And, yes, I believe the Udma controller accesses any page of the mobo Ram. Which isn't very difficult if you see a block diagram of the memory controller.

#34 User is offline   pointertovoid 

  • Advanced Member
  • PipPipPip
  • Group: Members
  • Posts: 406
  • Joined: 16-January 09

Posted 19 June 2012 - 12:34 PM

Jaclaz: and why I asked the question in the form:

View Postjaclaz, on 18 June 2012 - 02:22 AM, said:

Which role (if any) would NCQ (Native Command Queuing) have in the benchmark(s)?


With IOMeter, Ncq effect is very observable; very few other benchmarks show an effect; and the amount of improvement shown by IOMeter is consistent with real-life experience.

Just adding some duplicate plethoric redundant repetition again, to confirm we agree...

#35 User is offline   jaclaz 

  • The Finder
  • Group: Developers
  • Posts: 11,574
  • Joined: 23-July 04
  • OS:none specified
  • Country: Country Flag

Posted 19 June 2012 - 01:11 PM

View Postpointertovoid, on 19 June 2012 - 12:34 PM, said:

Just adding some duplicate plethoric redundant repetition again, to confirm we agree...

Sure we agree on this :thumbup .

The question was aimed at DiracDeBroglie, meant as a hint :rolleyes: towards the fact that Atto is not the "ideal" benchmark for a SATA III drive with NCQ ;).

jaclaz

#36 User is offline   DiracDeBroglie 

  • Junior
  • Pip
  • Group: Members
  • Posts: 88
  • Joined: 07-December 11
  • OS:Windows 7 x64
  • Country: Country Flag

Posted 19 June 2012 - 02:20 PM

What a disappointment!! With 10% random the Total MBs per Second drops from 90 (with 0% random) to about 3.5. With 100% random the performance went down to 1.5MB/sec!!! It's time for an SSD. #I/O was 32 all the time.

View Postpointertovoid, on 19 June 2012 - 12:19 PM, said:

(1) I too understand it as the number of simultaneous requests. 32 is huge, fits servers only. 2 or 4 is already much for a W2k or Xp workstation.
(2) IOMeter measures within a file of the size you define, beginning where you wish if possible (I hope it checks it before). The file size should represent the size of the useful part of the OS or application you want to access; 100MB is fine for W2k but too small for younger OS. After the first run begins, I stop it and defragment the volume so this file is contiguous.


Oh, I forgot to ask in my previous post: How can one create a test file of a particular length in IOmeter?

Thanks
j

This post has been edited by DiracDeBroglie: 19 June 2012 - 02:24 PM


#37 User is offline   DiracDeBroglie 

  • Junior
  • Pip
  • Group: Members
  • Posts: 88
  • Joined: 07-December 11
  • OS:Windows 7 x64
  • Country: Country Flag

Posted 20 June 2012 - 08:22 AM

Specifications of tested HDD: Seagate Barracuda 7200.12 (ST31000524AS); 1TB, 32MB cache; Queue Depth =32; NCQ =supported; average track size =0.8MB.

Some more IOmeter test results with parameters: 4K, 100%Read, a range of %s for randomness, and 4 different numbers of simultaneous I/Os (32; 4; 2; 1). The following figures show the throughput (Total MBs per Second) as a function of the % randomness. Test file = roughly 500MB (this can be set by specifying the number of sectors in the option Disk Targets -> Maximum Disk Size; in my case I specified 1,000,000 sectors of 512 bytes each, i.e. 512,000,000 bytes; Starting Disk Sector was kept at 0).

# simultaneous I/Os =32
------------------------
0% randomness -> 95 MB/s
1% randomness -> 96 MB/s (a bit higher than in the 0% case)
2% randomness -> 6.8 MB/s
3% randomness -> 6.3 MB/s
4% randomness -> 5.3 MB/s
5% randomness -> 5.0 MB/s
10% randomness -> 4.1 MB/s
20% randomness -> 3.2 MB/s
50% randomness -> 2.1 MB/s
100% randomness -> 1.4 MB/s

# simultaneous I/Os =4
------------------------
0% randomness -> 76 MB/s
1% randomness -> 84 MB/s (8 MB/s higher than in the 0% case)
2% randomness -> 17 MB/s
3% randomness -> 14 MB/s
4% randomness -> 10 MB/s
5% randomness -> 9.2 MB/s
10% randomness -> 4.8 MB/s
20% randomness -> 2.6 MB/s
50% randomness -> 1.4 MB/s
100% randomness -> 1.1 MB/s

# simultaneous I/Os =2
------------------------
0% randomness -> 67 MB/s
1% randomness -> 67 MB/s
2% randomness -> 14 MB/s
3% randomness -> 12 MB/s
4% randomness -> 9.8 MB/s
5% randomness -> 8.2 MB/s
10% randomness -> 4.7 MB/s
20% randomness -> 2.6 MB/s
50% randomness -> 1.2 MB/s
100% randomness -> 0.8 MB/s

# simultaneous I/Os =1
------------------------
0% randomness -> 43 MB/s
1% randomness -> 40 MB/s
2% randomness -> 9.8 MB/s
3% randomness -> 8.2 MB/s
4% randomness -> 7.2 MB/s
5% randomness -> 6.4 MB/s
10% randomness -> 4.0 MB/s
20% randomness -> 2.4 MB/s
50% randomness -> 1.1 MB/s
100% randomness -> 0.6 MB/s


All tests clearly show a maximum throughput at 0% or 1% randomness after which the performance rolls off quickly for higher randomness, usually from 2% randomness onwards.
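
To relate these MB/s figures to requests per second, the 100% randomness rows can be converted as sketched below. This is only an illustration: the function names are mine, and it assumes that 1 MB means 10^6 bytes in the IOmeter output, which I have not verified (with binary megabytes the numbers come out about 5% higher).

```python
# Convert 4K throughput figures into I/Os per second (IOPS).
BLOCK_BYTES = 4 * 1024   # 4K transfer size used in all tests
MB = 10**6               # ASSUMPTION: decimal megabytes in the IOmeter output

def iops(mb_per_s: float) -> float:
    """Requests completed per second at the given throughput."""
    return mb_per_s * MB / BLOCK_BYTES

def ms_per_request_serial(mb_per_s: float) -> float:
    """Average time per request when only one I/O is outstanding."""
    return 1000.0 / iops(mb_per_s)

# 100% randomness rows: simultaneous I/Os -> MB/s
fully_random = {32: 1.4, 4: 1.1, 2: 0.8, 1: 0.6}

for depth, rate in fully_random.items():
    print(f"{depth:>2} simultaneous I/Os: {iops(rate):4.0f} IOPS")

print(f"1 I/O at a time: {ms_per_request_serial(0.6):.1f} ms per request")
```

Under that assumption, 0.6 MB/s with a single outstanding I/O is about 146 IOPS, or roughly 6.8 ms per request. For comparison, a 7200 rpm platter needs about 8.3 ms per revolution, so the average rotational latency alone is about 4.2 ms.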

One would expect the highest throughput for the highest #I/O value, but that is not the case; it depends on the randomness. #I/O=32 has the best performance for a randomness of 0%, 1%, 20%, 50% and 100%. For the randomness range of 2% -> 10% the highest throughput comes with #I/O=4, and a randomness of 2% -> 10% probably comes the closest to an average "real life" system configuration. Take now the folder C:\Windows, which is the dominant folder in how fast Win7 boots; the Windows folder contains 21GB in 92,000 files, hence an average file size of about 0.22MB, which is a lot smaller than the average track size of 0.8MB on my system drive (so NCQ "could" be doing a good job and shorten the boot time).
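
The statement about which #I/O value wins at each randomness can be checked mechanically. The sketch below just re-enters the four tables above and picks the best row per column; the variable names are mine and nothing in it is measured beyond the posted figures.

```python
# Throughput in MB/s, copied from the four tables above.
randomness = [0, 1, 2, 3, 4, 5, 10, 20, 50, 100]   # % randomness
throughput = {
    32: [95, 96, 6.8, 6.3, 5.3, 5.0, 4.1, 3.2, 2.1, 1.4],
    4:  [76, 84, 17, 14, 10, 9.2, 4.8, 2.6, 1.4, 1.1],
    2:  [67, 67, 14, 12, 9.8, 8.2, 4.7, 2.6, 1.2, 0.8],
    1:  [43, 40, 9.8, 8.2, 7.2, 6.4, 4.0, 2.4, 1.1, 0.6],
}

# For every randomness level, find the number of simultaneous I/Os
# that gave the highest throughput.
best = {}
for i, pct in enumerate(randomness):
    best[pct] = max(throughput, key=lambda depth: throughput[depth][i])

for pct, depth in best.items():
    print(f"{pct:>3}% randomness -> best with {depth} simultaneous I/Os")
```

Note that at 10% randomness the spread between the four settings is small (4.0 to 4.8 MB/s), so the winner there could easily change from run to run.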

Intra-file randomness (on the platters) can be reduced by defragging the folder/boot volume, but file-to-file randomness cannot be detected by any defragger (at least I do not know of any defragger like that), for the simple reason that defraggers don't know the order in which files are read during the booting process. Hence, even when the boot volume is very well defragmented, one could still be facing a large (file-to-file) randomness that takes down the "real life" throughput significantly. Bottom line is that I haven't got a clue how good or bad the file-to-file randomness could be on my boot volume. Furthermore, I wouldn't be surprised if, after running a defragger, the intra-file randomness has gone down but the file-to-file randomness has increased, thereby (partially) offsetting the gain (shorter boot time) of the defragmentation.
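
To make the idea of file-to-file randomness concrete, here is a toy sketch. Everything in it is hypothetical: the file names, cluster numbers and boot order are invented for illustration, and it is not how any real defragger or Windows itself measures this. Each file is perfectly contiguous (no intra-file fragmentation), yet the heads still have to jump whenever the next file in the load order does not start where the previous one ended.

```python
# HYPOTHETICAL layout: file name -> (first cluster, length in clusters).
# Every file is one contiguous run, i.e. fully defragmented.
layout = {
    "a.sys": (100, 10),
    "b.dll": (110, 5),
    "c.dll": (500, 20),
    "d.exe": (115, 8),
}

def file_to_file_jumps(layout, order):
    """Count transitions where the next file is not physically adjacent."""
    jumps = 0
    for prev, nxt in zip(order, order[1:]):
        prev_start, prev_len = layout[prev]
        next_start, _ = layout[nxt]
        if next_start != prev_start + prev_len:
            jumps += 1
    return jumps

# HYPOTHETICAL load orders for the same on-disk layout.
boot_order = ["a.sys", "b.dll", "c.dll", "d.exe"]
friendly_order = ["a.sys", "b.dll", "d.exe", "c.dll"]

for name, order in (("boot", boot_order), ("friendly", friendly_order)):
    jumps = file_to_file_jumps(layout, order)
    print(f"{name} order: {jumps} jumps in {len(order) - 1} transitions")
```

With this made-up data the same layout gives 2 jumps for one load order and 1 for the other, which is the point: a defragger that does not know the load order cannot tell which of the two cases it is creating.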

Fortunately, in Win7 (on my notebook) communication between the motherboard and the system drive happens through (U)DMA, although I have to say I have no idea how large the data-chunks are in my system at boot time. It could be that the UDMA (in Win7) plays down (significantly) the importance of randomness in the throughput; in (older) OSs that don't have (or have limited) UDMA, the throughput is likely more sensitive to the file-to-file randomness (I think it's fair to say this was already implicitly confirmed by pointertovoid).

Conclusion:
1) jaclaz was/is right; benchmark results need very careful interpretation when extrapolating them to "real life" situations, as some parameters (like file order in the booting process) are not known by the benchmarkers.
2) Defragging the volume may help improve the booting speed (we all know that already); however, the gain in boot time might be small in Win7 due to UDMA.

johan

This post has been edited by DiracDeBroglie: 20 June 2012 - 08:32 AM


#38 jaclaz

  • The Finder
  • Group: Developers
  • Posts: 11,574
  • Joined: 23-July 04
  • OS:none specified

Posted 20 June 2012 - 08:59 AM

DiracDeBroglie, on 20 June 2012 - 08:22 AM, said:

Conclusion:
1) jaclaz was/is right; benchmark results need very careful interpretation when extrapolating them to "real life" situations, as some parameters (like file order in the booting process) are not known by the benchmarkers.
2) Defragging the volume may help improve the booting speed (we all know that already); however, the gain in boot time might be small in Win7 due to UDMA.

OT :ph34r:, but JFYI.
Many, many years ago, the world motocross champion was a Belgian guy named Georges Jobé:
http://en.wikipedia....
http://web.archive.o...rges/index.html

To a journalist who asked him how he prepared himself physically (which kind of exercises, jogging, gym, etc.) for the races, he replied something like:

Quote

I tend to go on the bike a lot.
Usually 4 hours every morning and a couple more hours in the afternoon.
I find it better for me than the gym.


Well, he was right ;):


jaclaz

#39 bphlpt

  • MSFN Expert
  • Group: Members
  • Posts: 1,117
  • Joined: 12-May 07

Posted 20 June 2012 - 01:52 PM

DiracDeBroglie, on 20 June 2012 - 08:22 AM, said:

Conclusion:
1) jaclaz was/is right; benchmark results need very careful interpretation when extrapolating them to "real life" situations, as some parameters (like file order in the booting process) are not known by the benchmarkers.
2) Defragging the volume may help improve the booting speed (we all know that already); however, the gain in boot time might be small in Win7 due to UDMA.


You should also look around more here on the board. :) http://www.msfn.org/...yresume-issues/ That is the reason that Andre only recommends the use of MS's own defragging tools. As you found, defragging just to eliminate fragmented files is only part of the solution. File placement is also a factor.

Cheers and Regards

This post has been edited by bphlpt: 20 June 2012 - 01:53 PM


#40 pointertovoid

  • Advanced Member
  • Group: Members
  • Posts: 406
  • Joined: 16-January 09

Posted 21 June 2012 - 03:06 AM

Your figures with random access are now consistent with other disk measurements. I dare to assert that random access measured with IOMeter, within a limited size and with some parallelism, is a better hint to real-life performance than other benchmarks are.

Udma is fully useable with Win95 if you provide the driver. Ncq only needs parallel requests, which exist already in Nt4.0 and maybe before - but not in W95-98-Me; it also needs adequate drivers willing to run on a particular OS. Seven (and Vista?) bring only additional mechanisms to help the wear leveling algorithm of Flash media, keeping write fast over time.

To paraphrase bphlpt: Windows since 98 does know which files are loaded in what order because a special task observes this. Windows' defragger uses this information to optimize file placement on the disk and reduce the arm's movements. Xp and its prefetch go further by pre-loading the files that belong together before Win or the application request them; this combines very well with the Ncq as it issues many requests in parallel, and partly explains why Xp starts faster than 2k with less arm noise.
