WIN32_FIND_DATA and negative file sizes.
July 13, 2010
UPDATE: Thanks to the comment from Bart, I found a problem with my approach. I based my code on the pinvoke.net example, which declared nFileSizeHigh and nFileSizeLow as int. That is why there was so much overflow. They should have been declared as uint (unsigned int), which covers the full DWORD range. I am going to leave my solution untouched below because it worked and it is a good learning experience. However, if you ever need to use WIN32_FIND_DATA, make sure you declare these fields as unsigned integers, and the MSDN solution should work fine. Thanks again to Bart.
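Here is a minimal sketch of the MSDN formula once the fields are unsigned. The class and method names (FileSizeSketch, Combine) are made up for illustration and are not part of pinvoke.net or the Windows API. The only real assumption is that both fields come out of the struct as uint.

```csharp
using System;

public static class FileSizeSketch
{
    // Combines the two DWORD halves of a file size into one 64-bit value.
    // Assumption: both fields are declared as uint in WIN32_FIND_DATA.
    public static ulong Combine(uint nFileSizeHigh, uint nFileSizeLow)
    {
        // MAXDWORD is 0xFFFFFFFF, so (MAXDWORD + 1) is 2^32.
        // Widen to 64 bits before adding so the constant itself does not wrap.
        const ulong TwoPow32 = (ulong)uint.MaxValue + 1;
        return (ulong)nFileSizeHigh * TwoPow32 + nFileSizeLow;
    }
}
```

For example, Combine(2, 0x80000000) gives 10737418240, which is exactly 10 GB, and nothing goes negative along the way.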
————————————————————————————————
ORIGINAL POST BEGINS:
I recently had to write a .NET Framework application for work that walks through the folders at a given location and returns the total of all the file sizes, so we can track changes in certain directories. I quickly found out that the .NET Framework file APIs do not allow paths beyond 260 characters, and you get an exception (PathTooLongException) if you try. Well, that is a pain.
So, working from an example on pinvoke.net, I started building an app based on the Windows API. This was completely new territory for me, and I ran into some surprises.
WIN32_FIND_DATA is the data structure that is passed into the FindFirstFile and FindNextFile functions and is filled with the found file's information. The two fields I was examining to get the file sizes were nFileSizeLow and nFileSizeHigh. But I noticed that for some files I was getting a negative file size returned. Huh?!?
MSDN says that to get the correct file size you need to get the file size this way: (nFileSizeHigh * (MAXDWORD+1)) + nFileSizeLow
Tried that and still got negative file sizes. WTF? I tried a Google search to see if anyone had encountered this. There were a few suggestions but after trying them I was still getting negative file sizes for really large files. I finally found a solution that worked even when it encountered files that were 10GB in size.
Below is my explanation of the why, followed by my solution. This worked great for me, but if anyone encounters this and has suggestions, please let me know in the comments. I figured this was worth a post since I haven’t seen this solution anywhere yet.
Explanation:
WIN32_FIND_DATA holds the file size in bytes, split across two fields. nFileSizeLow holds the low-order 32 bits, and nFileSizeHigh is greater than zero only if the file size is larger than nFileSizeLow alone can hold. Both fields are DWORDs, the Windows name for a 32-bit unsigned integer (EDIT: I originally wrote "signed" here, which was incorrect).
The maximum for a signed 32-bit integer is (2^31) - 1 = 2,147,483,647. Image 1.1 shows the bit pattern for this.
So if we add one to this number we get the bit pattern that is shown in Image 1.2.
Read as a signed integer, that pattern gives -2147483648 when we should get +2147483648. To rectify this we need to add back the range that was lost. That is 4294967296, which is 2147483648 * 2, or 2^32.
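The wrap-around is easy to reproduce in isolation. This is only a sketch: WrapAroundSketch, AsSigned and Repair are names I made up for the illustration, and the raw uint argument stands in for whatever bits Windows put in the field.

```csharp
using System;

public static class WrapAroundSketch
{
    // Reads a raw 32-bit pattern the way a signed int field would see it.
    public static int AsSigned(uint raw)
    {
        return unchecked((int)raw);
    }

    // Adds 2^32 back to a negative reading to recover the value that was stored.
    public static long Repair(int signedReading)
    {
        return signedReading < 0 ? signedReading + 4294967296L : signedReading;
    }
}
```

AsSigned(2147483648) comes back as -2147483648, and Repair(-2147483648) returns 2147483648 again.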
There are four possible outcomes for WIN32_FIND_DATA’s file size info when the fields are read as signed integers:
1. nFileSizeLow < 0 and nFileSizeHigh > 0
2. nFileSizeLow >= 0 and nFileSizeHigh > 0
3. nFileSizeLow < 0 and nFileSizeHigh <= 0
4. nFileSizeLow >= 0 and nFileSizeHigh <= 0
So MSDN’s solution, applied to signed fields, does not account for all of these possibilities. Below is my solution.
notes:
- 4294967296 = (32-bit signed int max + 1) * 2 = (((2^31) - 1) + 1) * 2 = 2147483648 * 2
- +2147483647 = 32-bit signed int max --> 0111 1111 1111 1111 1111 1111 1111 1111
- -2147483648 = 32-bit signed int max + 1 --> 1000 0000 0000 0000 0000 0000 0000 0000 --> sign change! Should be +2147483648
- -2147483648 + 2147483648 + 2147483648 = -2147483648 + (2147483648 * 2) = -2147483648 + 4294967296 = 2147483648 --> sweet!
The WIN32_FIND_DATA struct looks like this:
// MAX_PATH (260) and MAX_ALTERNATE (14) are constants declared elsewhere in the class.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct WIN32_FIND_DATA
{
    public FileAttributes dwFileAttributes;
    public FILETIME ftCreationTime;
    public FILETIME ftLastAccessTime;
    public FILETIME ftLastWriteTime;
    public int nFileSizeHigh; // EDIT: this should be uint, not int
    public int nFileSizeLow;  // EDIT: this should be uint, not int
    public int dwReserved0;   // EDIT: this should be uint, not int
    public int dwReserved1;   // EDIT: this should be uint, not int
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = MAX_PATH)]
    public string cFileName;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = MAX_ALTERNATE)]
    public string cAlternate;
}
In the code below, findData is an instance of this struct.
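For comparison, this is roughly what the declaration looks like with the DWORD fields switched to uint, as the EDIT comments say they should be. Treat it as a sketch: placing the two constants inside the struct and fully qualifying FILETIME are my own choices to keep the snippet self-contained, not something taken from pinvoke.net.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct WIN32_FIND_DATA
{
    // Assumption: constants kept inside the struct only so the sketch compiles on its own.
    public const int MAX_PATH = 260;
    public const int MAX_ALTERNATE = 14;

    public FileAttributes dwFileAttributes;
    public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
    public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
    public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
    public uint nFileSizeHigh; // DWORD: high-order 32 bits of the size
    public uint nFileSizeLow;  // DWORD: low-order 32 bits of the size
    public uint dwReserved0;
    public uint dwReserved1;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = MAX_PATH)]
    public string cFileName;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = MAX_ALTERNATE)]
    public string cAlternate;
}
```

With this version neither size field can ever read as negative, so none of the sign handling in my solution is needed.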
My solution:
// store nFileSizeLow
long fDataFSize = (long)findData.nFileSizeLow;
// store individual file size for later accounting usage
long fileSize = 0;
if (fDataFSize < 0 && (long)findData.nFileSizeHigh > 0)
{
    fileSize = fDataFSize + 4294967296 + ((long)findData.nFileSizeHigh * 4294967296);
}
else
{
    if ((long)findData.nFileSizeHigh > 0)
    {
        fileSize = fDataFSize + ((long)findData.nFileSizeHigh * 4294967296);
    }
    else if (fDataFSize < 0)
    {
        fileSize = (fDataFSize + 4294967296);
    }
    else
    {
        fileSize = fDataFSize;
    }
}
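If you are stuck with the signed declaration, the four branches can probably be collapsed by reinterpreting each field as unsigned before combining them. This is a sketch, not a drop-in replacement: SignedFieldSketch and ToFileSize are hypothetical names, and it assumes nFileSizeHigh is never negative (that would mean a file of 2^63 bytes or more), which is the one case where it behaves differently from the code above.

```csharp
using System;

public static class SignedFieldSketch
{
    // Rebuilds the 64-bit size from fields that were (wrongly) declared as int.
    // Assumption: nFileSizeHigh >= 0, so the shifted value still fits in a long.
    public static long ToFileSize(int nFileSizeHigh, int nFileSizeLow)
    {
        // Reinterpret the same 32 bits as unsigned; a negative low part
        // becomes the large positive value it was meant to be.
        long low = unchecked((uint)nFileSizeLow);
        long high = unchecked((uint)nFileSizeHigh);
        return (high << 32) + low;
    }
}
```

ToFileSize(2, int.MinValue) gives 10737418240, the same result the first branch of my solution produces for those inputs.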


August 25, 2010 at 12:47 am
DWORD should not be interpreted as int, but as uint. The WIN32_FIND_DATA struct should use uint instead of int for DWORD values.
DWORD is a 32-bit unsigned integer. The range is 0 through 4294967295 decimal.
INT is a 32-bit signed integer. The range is -2147483648 through 2147483647 decimal.
UINT is an unsigned INT. The range is 0 through 4294967295 decimal.
March 30, 2011 at 2:20 pm
Thanks for the reply, Bart. After I read your post I definitely facepalmed. Chalk it up as a learning experience.