MSFN Forum: Having some serious hardware issues - MSFN Forum


Having some serious hardware issues: Running out of ideas...

#1 User is offline   mongo66 

  • Junior
  • Pip
  • Group: Members
  • Posts: 75
  • Joined: 28-October 08

  Posted 23 December 2008 - 11:49 AM

I've been given the wonderful task of troubleshooting an older machine. Without further ado, here goes...

System Specs:

Athlon XP 2600+ (Barton)
Gigabyte GA-7N400S
2x256MB DDR PC3200 RAM
ATI Radeon 9550 (RV360) AGP
C-Media CMI8738 6CH-PCI
80GB Seagate ATA-100
400W PSU

Windows XP Service Pack 3

Initial observations:

Upon turning on the computer, during the BIOS POST, I noticed the following:

TRAP 00000006  ================== EXCEPTION =================================

tr=0028	cr0=00000011	cr2=00F0836B	cr3=00000000

gdt limit=03FF   base=00017000   idt limit=07FF

cs:eip=0008:00060EC4   ss:eip=0010:00060E6C   errcode=0000

flags=00010002   NoCy   NoZr   IntDis   Down   TrapDis

eax=00000026   ebx=00000FFF   ecx=00000413   edx=0000046B   ds=0010   es=0010
edi=0046A165   esi=00000000   ebp=00060E62   cr0=00000011   fs=0030   gs=0000


This has appeared 3-4 times since I got the computer a week ago. I don't know exactly what it means, but it seems to appear only after changing the memory frequency setting in the BIOS with dual channel enabled.
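For readers wondering what the screen above says, here is a minimal Python sketch that decodes two of the fields. It assumes the BIOS "TRAP" number is the raw x86 exception vector, which the post does not confirm; the lookup tables are deliberately partial, and the names `parse_dump` and `set_flags` are made up for this illustration.

```python
import re

# Register values copied from the POST exception screen quoted above.
DUMP = "TRAP 00000006 tr=0028 cr0=00000011 cr2=00F0836B cr3=00000000 errcode=0000"

# Standard x86 exception vectors (subset). Assumption: the BIOS "TRAP"
# number is the raw CPU exception vector.
X86_VECTORS = {
    0x00: "#DE divide error",
    0x06: "#UD invalid opcode",
    0x08: "#DF double fault",
    0x0D: "#GP general protection fault",
    0x0E: "#PF page fault",
}

# Architecturally defined CR0 flag bits (subset).
CR0_BITS = {0: "PE (protected mode)", 4: "ET", 16: "WP", 31: "PG (paging)"}


def parse_dump(text):
    """Return the trap number and all key=value registers as integers."""
    trap = int(re.search(r"TRAP\s+([0-9A-Fa-f]+)", text).group(1), 16)
    regs = {k: int(v, 16) for k, v in re.findall(r"(\w+)=([0-9A-Fa-f]+)", text)}
    return trap, regs


def set_flags(value, names):
    """List the names of the bits that are set in value."""
    return [name for bit, name in sorted(names.items()) if (value >> bit) & 1]


trap, regs = parse_dump(DUMP)
print(X86_VECTORS.get(trap, "unknown"))
print(set_flags(regs["cr0"], CR0_BITS))
```

Under that assumption, the screen reports an invalid-opcode exception raised in protected mode with paging off. That would fit the CPU fetching garbage while firmware code is running, but by itself it does not say which component supplied the garbage.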

On the software side of things, I cannot install Windows if RAM is operating in dual channel mode; the system hangs during text-mode setup (specifically while copying driver.cab). The only solution is to disable it.

I've also swapped the memory modules around, and tried a single 256MB stick on its own. The computer doesn't experience any startup issues in single channel mode, whether I use one or both sticks of RAM. On the other hand, Windows will not boot if dual channel is enabled AND memory frequency is configured by SPD (400 MHz):

Windows could not start because the following file is missing or corrupt:

\WINDOWS\SYSTEM32\CONFIG\SYSTEM


If I change the memory frequency to 100% or Auto (333 MHz) in the BIOS, Windows starts up just fine with dual channel enabled.

Crash n' burn !!

The crash scenarios mentioned below occur regardless of memory configuration and without any overclocking.

(1) Music playback (soundcard; onboard AC97 disabled)
Occasionally, the mouse, keyboard and screen freeze after a while. Playback, however, remains uninterrupted. The computer can still be shut down (normally) by pressing the power button. Moving the soundcard to a different PCI slot produced the same results. The soundcard isn't faulty, btw.

(2) Music playback (integrated 6-channel AC97 audio)
The system comes to a halt with a loud, constant screeching noise. The computer cannot be shut down normally; I have to hold down the power button to power off...

(3) Video playback (WMP, MPC, etc) results in a BSOD after a few minutes.

STOP 0x0000007F (UNEXPECTED_KERNEL_MODE_TRAP)

or,

TRAP_CAUSE_UNKNOWN
STOP 0x00000012 (0x00000001,0x00000000,0x00000000,0x00000000)

*** Note: Using Windows default display drivers, I can *almost* watch an entire movie before any BSOD or system freeze. I also tested the Radeon card in another machine with Catalyst / Omega drivers installed. No video playback issues to speak of... Definitely nothing wrong with the graphics card.

(4) Web browsing -- If I have multiple tabs open, or visit Flash-intensive web pages, the system either freezes or loses internet connectivity. The system bogs down and eventually crashes... Only after a "forced" shutdown or reset do things return to normal so that I'm able to browse again.
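The two STOP lines quoted in scenario (3) can be split into code, name and parameters with a few lines of Python. This is only a sketch: the name table contains just the two codes from this post, and `parse_stop` is a hypothetical helper, not part of any Windows tool.

```python
import re

# Bug check names as quoted in the post; not a complete table.
BUGCHECK_NAMES = {
    0x7F: "UNEXPECTED_KERNEL_MODE_TRAP",
    0x12: "TRAP_CAUSE_UNKNOWN",
}

STOP_RE = re.compile(r"STOP\s+0x([0-9A-Fa-f]+)(?:\s+\(([^)]*)\))?")


def parse_stop(line):
    """Split a STOP line into (code, name, parameters)."""
    m = STOP_RE.search(line)
    if m is None:
        return None
    code = int(m.group(1), 16)
    params = []
    # Treat the parenthesised part as parameters only if it holds hex values.
    if m.group(2) and m.group(2).lstrip().lower().startswith("0x"):
        params = [int(p, 16) for p in m.group(2).split(",")]
    return code, BUGCHECK_NAMES.get(code, "unknown"), params


print(parse_stop("STOP 0x0000007F (UNEXPECTED_KERNEL_MODE_TRAP)"))
print(parse_stop("STOP 0x00000012 (0x00000001,0x00000000,0x00000000,0x00000000)"))
```

For a 0x7F stop, the first parameter normally identifies which processor trap fired, so it is worth writing down the full parameter list the next time the blue screen appears; the post only quotes the name for that one.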

Steps taken (so far):

- Reset BIOS to default settings
- Ran Memtest86... RESULT: no errors.
- Checked PSU voltages... RESULT: nominal readings.
- Checked harddisk for bad sectors... RESULT: none found.
- Reinstalled Windows XP and updated device drivers with no positive effects...

Also, there are no heat issues as far as I can tell, nor are there any leaking or bulging capacitors on the motherboard itself. The system is already running the latest BIOS version. After nearly a week of troubleshooting, I suspect the CPU and/or motherboard is "dying"...

What are your thoughts?


#2 User is offline   cluberti 

  • Gustatus similis pullus
  • Group: Supervisor
  • Posts: 11,000
  • Joined: 09-September 01
  • OS:Windows 7 x64

Posted 23 December 2008 - 01:23 PM

Let's see a dump file to see if there's anything obvious, or if it looks like hardware.

#3 User is offline   mongo66 


Posted 23 December 2008 - 04:29 PM

After my initial post, the computer seems to have behaved somewhat... I actually managed to shut down the computer through the Start menu for a change. :P

I've worked with the GA-7N400 series boards before -- both PRO and entry level boards like this one. Never ran into any problems, up until now. With this particular machine, it certainly doesn't take a lot of effort to initiate a system crash... Playing a game of spider solitaire (with no other programs running) causes a hard freeze. :blink:

Something I forgot to mention earlier... While checking system temps, I noticed the CPU was running rather cool, around 42-46°C. This is quite unusual for an Athlon XP 2600+ with "stock cooling". Idle temps should be in the 50-55°C range. Perhaps the temp sensors have gone mad as well...

Anyways, here are the dump files as requested (16 in total).


#4 User is offline   cluberti 


Posted 23 December 2008 - 05:45 PM

Well, you aren't likely to enjoy reading this analysis, but...
//From the very first dump, I saw something very odd:
kd> !thread
GetPointerFromAddress: unable to read from 8055fbd4
THREAD 817e26a0  Cid 0158.0184  Teb: 7ffd8000 Win32Thread: e1778be0 RUNNING on processor 0
IRP List:
	Unable to read nt!_IRP @ 817ece20
Not impersonating
GetUlongFromAddress: unable to read from 8055fc6c
Owning Process			817afda0	   Image:		 csrss.exe
Attached Process		  N/A			Image:		 N/A
ffdf0000: Unable to get shared data
Wait Start TickCount	  13394		
Context Switch Count	  5316				 LargeStack
ReadMemory error: Cannot get nt!KeMaximumIncrement value.
UserTime				  00:00:00.000
KernelTime				00:00:00.000
Start Address 0x75b67cdf
Stack Init f98a4000 Current f98a39c8 Base f98a4000 Limit f98a0000 Call 0
Priority 15 BasePriority 13 PriorityDecrement 0 DecrementCount 0
ChildEBP RetAddr  Args to Child			  
f98a3998 804f170c 00000100 806f02d0 817e2710 hal!KfLowerIrql+0x17 (FPO: [0,0,0])
f98a39d4 804ecae9 00000000 00000000 00000000 nt!KiDeliverApc+0x118 (FPO: [Non-Fpo]) (CONV: stdcall)
f98a39ec 804e3b7d 804e3a0d e1778be0 00000000 nt!KiSwapThread+0x64 (FPO: [0,0,0]) (CONV: fastcall)
f98a3a24 bf807aec 00000003 817c8af0 00000001 nt!KeWaitForMultipleObjects+0x284 (FPO: [Non-Fpo]) (CONV: stdcall)
f98a3a5c bf89b7c4 00000002 817c8af0 bf89e712 win32k!xxxMsgWaitForMultipleObjects+0xb0 (FPO: [Non-Fpo]) (CONV: stdcall)
f98a3d30 bf884773 bf9aae80 00000001 f98a3d54 win32k!xxxDesktopThread+0x339 (FPO: [Non-Fpo]) (CONV: stdcall)
f98a3d40 bf80110a bf9aae80 f98a3d64 0073fff4 win32k!xxxCreateSystemThreads+0x6a (FPO: [Non-Fpo]) (CONV: stdcall)
f98a3d54 804de7ec 00000000 00000022 00000000 win32k!NtUserCallOneParam+0x23 (FPO: [Non-Fpo]) (CONV: stdcall)
f98a3d54 7c90e4f4 00000000 00000022 00000000 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ f98a3d64)
WARNING: Frame IP not in any known module. Following frames may be wrong.
00000000 00000000 00000000 00000000 00000000 0x7c90e4f4

kd> r
Last set context:
eax=00000000 ebx=804e3a0d ecx=804ecae9 edx=f98a3a24 esi=804e3b7d edi=817e26a0
eip=00000001 esp=f98a3a0c ebp=e1778be0 iopl=1		 nv up di pl nz ac pe cy
cs=3a18  ss=0010  ds=562d  es=0000  fs=0000  gs=7c40			 efl=bf80101d
3a18:00000001 ??			  ???

// Note that the stack registers look suspicious - they should fall within the thread's stack range (Limit to Base), but EBP does not:
Base f98a4000 Limit f98a0000
esp=f98a3a0c ebp=e1778be0  <-!!

// A dps to the base address:
...
f98a3fc8  8181adf0
f98a3fcc  00000004
f98a3fd0  80562340 nt!ExWorkerQueue+0x80
f98a3fd4  00167398
f98a3fd8  00160168
f98a3fdc  00000000
f98a3fe0  00000000
f98a3fe4  00167398
f98a3fe8  00000040
f98a3fec  001673a0
f98a3ff0  00000000
f98a3ff4  00000000
f98a3ff8  00000000
f98a3ffc  00000000
f98a4000  ????????  <- Base address looks to be corrupt

If you've tested the memory, then this indicates a bad CPU or motherboard (or both) - this wouldn't happen if the hardware underneath Windows didn't have some issue, and register issues are almost always CPU-related or motherboard-related problems...
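The range check behind that observation is easy to reproduce. A minimal Python sketch using only the numbers from the !thread and r output above (`in_stack` is a name invented for this example):

```python
# Values copied from the !thread and r output above.
STACK_BASE = 0xF98A4000   # "Base": top of the stack, exclusive
STACK_LIMIT = 0xF98A0000  # "Limit": lowest valid stack address
REGISTERS = {"esp": 0xF98A3A0C, "ebp": 0xE1778BE0}


def in_stack(addr, limit=STACK_LIMIT, base=STACK_BASE):
    """True if addr lies inside the thread's kernel stack [limit, base)."""
    return limit <= addr < base


for name, value in REGISTERS.items():
    status = "inside" if in_stack(value) else "OUTSIDE"
    print(f"{name}={value:08x} is {status} the stack range")
```

With the values as posted, ESP passes the check and EBP does not. The EBP value is also identical to the Win32Thread pointer printed by !thread (e1778be0), which suggests the register context holds data that belongs somewhere else.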

#5 User is offline   mongo66 


Posted 24 December 2008 - 12:50 PM

@cluberti,

Thanks for your analysis, I really appreciate it :) The analysis seems to have confirmed my initial suspicions...

cluberti, on Dec 24 2008, 12:45 AM, said:

If you've tested the memory, then this indicates a bad CPU or motherboard (or both) - this wouldn't happen if the hardware underneath Windows didn't have some issue, and register issues are almost always CPU-related or motherboard-related problems...

I did test the memory, albeit only for a couple of hours. Given the recurring system crashes, I figured that if the memory were at fault, Memtest86 would have detected errors in a short amount of time... I'll test the RAM again later today -- just to be sure. Only this time, I'll let it run for 12 hours.

Other interesting developments...

As I was using the machine today, it froze up again (no surprise there). After a forced shutdown and restart, I was greeted with:

NTLDR is missing ... 
Press Ctrl-Alt-Del to restart

WTF? So I fired up BartPE just to see what was going on... I couldn't access drive C: but was able to access the logical drive (D:). Ran chkdsk on drive C: with the following results:

CHKDSK.CMD: Starting...

Please enter the drive, mount point or volume name to check (for example "c:")...
Enter drive:c:

Do you want to fix errors on the disk *and*
locate bad sectors and recover readable information (Yes/No)...
Enter "y" or "n":n

Do you want to fix errors on the disk (Yes/No)...
Enter "y" or "n":y

----------------------------------------------------------------
You have specified to check drive/volume c:

With the following options:
- Fix errors on the disk
----------------------------------------------------------------

Start check disk? (Yes/No)...
Enter "y" or "n":y
Running: chkdsk.exe c: /f
The type of the file system is NTFS.

CHKDSK is verifying files (stage 1 of 3)...
File verification completed.
CHKDSK is verifying indexes (stage 2 of 3)...
Correcting error in index $I30 for file 5.
Correcting error in index $I30 for file 5.
Sorting index $I30 in file 5.
Index verification completed.
CHKDSK is recovering lost files.
Recovering orphaned file $MFT (0) into directory file 5.
Recovering orphaned file $MFTMirr (1) into directory file 5.
Recovering orphaned file $LogFile (2) into directory file 5.
Recovering orphaned file $Volume (3) into directory file 5.
Recovering orphaned file $AttrDef (4) into directory file 5.
Recovering orphaned file . (5) into directory file 5.
Recovering orphaned file $Bitmap (6) into directory file 5.
Recovering orphaned file $Boot (7) into directory file 5.
Recovering orphaned file $BadClus (8) into directory file 5.
Recovering orphaned file $Secure (9) into directory file 5.
Recovering orphaned file $UpCase (10) into directory file 5.
Recovering orphaned file $Extend (11) into directory file 5.
Recovering orphaned file SYSTEM~1 (27) into directory file 5.
Recovering orphaned file System Volume Information (27) into directory file 5.
Recovering orphaned file WINDOWS (30) into directory file 5.
Recovering orphaned file ntldr (3376) into directory file 5.
Recovering orphaned file NTDETECT.COM (3380) into directory file 5.
Recovering orphaned file boot.ini (3413) into directory file 5.
Recovering orphaned file PROFILES (3419) into directory file 5.
Recovering orphaned file VOLUMEID.EXE (3722) into directory file 5.
Recovering orphaned file PROGRA~1 (3732) into directory file 5.
Recovering orphaned file Program Files (3732) into directory file 5.
Recovering orphaned file RECYCLER (9223) into directory file 5.
Recovering orphaned file pagefile.sys (9245) into directory file 5.
Recovering orphaned file symstore (9318) into directory file 5.
CHKDSK is verifying security descriptors (stage 3 of 3)...
Security descriptor verification completed.
CHKDSK is verifying Usn Journal...
Usn Journal verification completed.
Correcting errors in the Master File Table (MFT) mirror.
Correcting errors in the Volume Bitmap.
Windows has made corrections to the file system.

  15631213 KB total disk space.
   2820732 KB in 11067 files.
	  3200 KB in 1470 indexes.
		 0 KB in bad sectors.
	 92901 KB in use by the system.
	 65536 KB occupied by the log file.
  12714380 KB available on disk.

	  4096 bytes in each allocation unit.
   3907803 total allocation units on disk.
   3178595 allocation units available on disk.
Unable to obtain a handle to the event log.

CHKDSK.CMD: Check disk done...
Press any key to continue . . .

What else could go wrong?! lol. Luckily enough, chkdsk fixed the errors and I was able to get into Windows...
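As a quick sanity check on the repaired volume, the figures in the chkdsk summary can be cross-checked against each other. A small Python sketch, using only numbers from the output above:

```python
# Figures copied from the chkdsk summary above.
total_kb = 15631213
files_kb = 2820732
indexes_kb = 3200
bad_kb = 0
system_kb = 92901        # the totals only balance if this includes the log file
available_kb = 12714380

cluster_bytes = 4096
total_clusters = 3907803
free_clusters = 3178595

used_kb = files_kb + indexes_kb + bad_kb + system_kb
kb_per_cluster = cluster_bytes // 1024

# Used + available should equal the reported total.
print(used_kb + available_kb == total_kb)
# Free clusters times cluster size should equal the available space.
print(free_clusters * kb_per_cluster == available_kb)
# Difference between the reported total and clusters times cluster size, in KB.
print(total_kb - total_clusters * kb_per_cluster)
```

Both comparisons hold, and the reported total differs from the cluster count by only 1 KB, so the summary is internally consistent after the repair. That says nothing about whether individual files were damaged; it only shows the volume accounting adds up again.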

#6 User is offline   cluberti 


Posted 26 December 2008 - 10:02 AM

Well, an unclean shutdown during any NTFS flush operation would cause it, so if you had to kill power at a "bad" time that could do it (and since chkdsk cleaned it, that's a very likely scenario).

#7 User is offline   mongo66 


Posted 28 December 2008 - 03:29 PM

I've completed another round of memory tests over the weekend. Memtest86 ran for almost 16 hours without a single error. Therefore, it's safe to assume the RAM modules aren't faulty. These tests were carried out using the default Front Side Bus (FSB) of 166 MHz (no overclocking). Note: I did not run any tests with dual channel enabled, as I felt it was unnecessary.

DRAM Frequency : By SPD   << Bios default setting
Memory timings : RAM 200 MHz (DDR400) / CAS : 2.5-3-3-8 / Single Channel (64-bit)
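To put those settings in perspective, here is the nominal peak-bandwidth arithmetic as a short Python sketch. It assumes double data rate transfers on a 64-bit bus for both the memory and the Athlon XP front side bus; these are theoretical figures, not measurements from this machine, and `peak_bandwidth_mb_s` is a name invented for this example.

```python
def peak_bandwidth_mb_s(clock_mhz, bus_bits=64, channels=1, transfers_per_clock=2):
    """Theoretical peak transfer rate in MB/s (1 MB = 10**6 bytes)."""
    return clock_mhz * transfers_per_clock * (bus_bits // 8) * channels


print(peak_bandwidth_mb_s(200))              # DDR400, single channel
print(peak_bandwidth_mb_s(200, channels=2))  # DDR400, dual channel
print(peak_bandwidth_mb_s(166))              # 166 MHz FSB, 64-bit, double-pumped
```

The single channel DDR400 figure is where the "PC3200" label comes from. With the FSB at 166 MHz, memory running at 200 MHz is clocked asynchronously to the bus, whereas the 100% / Auto (333 MHz) setting keeps the two in step. That is one possible reason the slower setting behaved better in dual channel mode, though nothing in the tests above proves it.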

With all the problems I've encountered with this machine including the minidump analysis, I'm forced to conclude either the processor, motherboard or a combination thereof, has gone "bad".

Now I have to figure out how to explain this to the owner of the computer when I return it to him tomorrow.

Case closed.

#8 User is offline   cluberti 


Posted 31 December 2008 - 01:03 PM

With my dump analysis above and your thorough memory test, I think it's pretty safe to say that should be enough.




All trademarks mentioned on this page are the property of their respective owners
Copyright © 2001 - 2011 msfn.org