How is Data Written, Stored On, and Erased From Hard Disks?
by Steve Burgess
One of my favorite IT Directors, Buzz Eyler of the Orcutt Unified School District,
tells me that, "Most people have no clue how data is stored on a hard drive running
Windows. A discussion of how it is written and marked for erasing would help a lot
of people understand what's happening under the hood of their computer."
First, a little background: Inside your hard disk is a stack of one or more
optically perfect platters where data is stored magnetically. When the drive is
originally formatted, it is laid out in a pattern of concentric circles
("tracks"; the matching tracks on every platter together form a "cylinder")
and wedges. Try to imagine a hybrid of a record album and a pizza
pie... or a dartboard. However, rather than 8 slices of pizza, or about 80
places big enough to land your dart, there may be hundreds of millions of
extremely small "Sectors."
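A Sector is 512 "bytes" in size - big enough to hold 512 single-byte characters,
or 256 characters when each one takes 2 bytes. Windows groups Sectors into
"Clusters", each of which can hold as many as 64 Sectors (the exact number
depends on the file system and the size of the volume; this article uses 64).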
Every time you create a file, Windows sets aside - "allocates" - at least one
Cluster, and then writes your data to it. Whenever a file exceeds one Cluster in
size, the computer allocates another entire Cluster. But even if a file consists
of one letter, which is 2 bytes in size, the computer allocates approximately
32,000 (actually 32,768) bytes of space.
The file may then be written to only the first 2 bytes of the Cluster, leaving
the great majority of the Cluster unchanged, as "file slack." The Cluster won't
be assigned to another file until the original file is deleted - that is, until
the original is sent to the Recycle Bin, and the Recycle Bin emptied.
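The arithmetic above can be sketched in a few lines of Python. This is only an illustration using the article's figures (512-byte Sectors, 64 Sectors per Cluster); the function names are made up for this example, and real volumes often use other Cluster sizes.

```python
SECTOR_BYTES = 512         # Sector size given in the article
SECTORS_PER_CLUSTER = 64   # the article's figure; an assumption, real volumes vary
CLUSTER_BYTES = SECTOR_BYTES * SECTORS_PER_CLUSTER  # 32,768 bytes


def allocated_bytes(file_size: int) -> int:
    """Space set aside for a file: whole Clusters, and at least one."""
    clusters = max(1, (file_size + CLUSTER_BYTES - 1) // CLUSTER_BYTES)
    return clusters * CLUSTER_BYTES


def slack_bytes(file_size: int) -> int:
    """Bytes in the allocated Clusters that the file itself does not fill."""
    return allocated_bytes(file_size) - file_size


if __name__ == "__main__":
    # A one-letter file, a file that exactly fills a Cluster, and one byte more.
    for size in (2, 32_768, 32_769):
        print(size, allocated_bytes(size), slack_bytes(size))
```

Under these assumptions, the one-letter, 2-byte file takes up a full 32,768-byte Cluster and leaves 32,766 bytes of file slack, while a file just one byte larger than a Cluster takes up two.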
But this one Cluster isn't the only place to which your data is written.
Where, and in how many places, data is written depends in part on the
application writing it.
When a file is saved, several attributes are saved with it. One is the
date the file was created; one is the date the file was last changed, or
modified; one is the date the file was last accessed. This information is kept
as part of a file listing called a "directory." The user sees this directory
as the contents of a folder.
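If you are curious, you can look at these dates yourself. The following is a minimal sketch using Python's standard os.stat call; the function name file_dates is made up for this example. One caveat: on Windows, st_ctime has traditionally reported the creation date, while on Unix-like systems it reports the last change to the file's metadata instead.

```python
import os
from datetime import datetime


def file_dates(path: str) -> dict:
    """Return the dates the operating system keeps for a file or folder."""
    info = os.stat(path)
    return {
        "modified": datetime.fromtimestamp(info.st_mtime),
        "accessed": datetime.fromtimestamp(info.st_atime),
        # Creation date on Windows (traditionally); metadata-change date on
        # Unix-like systems. Treat this field as platform dependent.
        "created_or_changed": datetime.fromtimestamp(info.st_ctime),
    }


if __name__ == "__main__":
    # "." is the current folder; substitute the path of any file you like.
    for label, when in file_dates(".").items():
        print(f"{label}: {when}")
```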
Take, for example, Microsoft Word, the leading word processing program for
office computers. As soon as the user begins a Word document, an invisible,
temporary work file is created (call it "Work File A"), and parts of the new
document get written to the virtual memory file (which, in Windows XP, is called
pagefile.sys). Call this one the "VM file."
When the user saves the document, a file is created on the hard disk with the name
the user gives it; call it "User Document." We think we have created one document,
but the data we're typing is going into three separate files. If we close the
document, "Work File A" is deleted, but it doesn't go away - more on this later.