In my experience, hardware fault error messages are quite unreliable, and at the end of the day DIMMs are orders of magnitude more likely to fail than CPUs...

/Peter Kjellström

Perhaps try a Linux hardware forum rather than an openSUSE one?
Best regards,
Gigabyte technical support team.

http://serverfault.com/questions/453186/ecc-errors-in-l3-cache-critical-or-not
In this case, it was a flaw in the processor causing the problem, not the kernel.
Re: L3 Cache ECC Error, by TrevorH » 2013/05/20 10:31:54

I'd report it to HP as a hardware

Uncorrectable does not indicate there is a permanent hardware error.

Message from [email protected] at Feb 17 17:16:36 ...
Northbridge Error
(Question asked Nov 28 '12 at 20:45 by L3error.)

Were they all around the same time?
This patch disables MCE error reporting for bank 6.
Every error so far was reported as corrected, but this is pretty annoying and probably not safe. This is the same advice I got from my colleagues, who also mentioned that there are too many variables (i.e.

MC4 Error (node 3): L3 data cache ECC error

But cat /sys/devices/system/edac/mc/mc*/csrow*/size_mb shows 4x 4096.

The build:
AMD Opteron 2435 x2
SuperMicro H8DAE-2
XFX AMD Radeon HD 6750
DDR2-400 RAM - Various 12 DIMM (see below)
Enermax NAXN 750AWT
OCZ Petrol SSD
Slackware64 13.37

The errors
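The sysfs check quoted above can be scripted. What follows is only a rough sketch, assuming the EDAC driver is loaded and exposes the usual per-csrow files: size_mb appears in the post, while ce_count and ue_count (corrected and uncorrected error counters) are assumed to sit alongside it. The function name read_edac_counts is made up for illustration.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {relative csrow path: {file name: int}} for each csrow under root.

    Only files that actually exist are read, so a kernel that lacks one of
    the assumed counters simply yields a smaller dictionary.
    """
    report = {}
    for csrow in sorted(Path(root).glob("mc*/csrow*")):
        values = {}
        # size_mb comes from the post; ce_count/ue_count are assumed here.
        for name in ("size_mb", "ce_count", "ue_count"):
            entry = csrow / name
            if entry.is_file():
                values[name] = int(entry.read_text().strip())
        report[str(csrow.relative_to(root))] = values
    return report


if __name__ == "__main__":
    for csrow, values in read_edac_counts().items():
        print(csrow, values)
```

A csrow whose ce_count keeps climbing between runs would point at a DIMM rather than the CPU cache, which is the distinction the thread is trying to make.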
WrinkledCheese, 11-27-2012, 10:55 PM, #2:

I added the MCELOG to the original post.

So I thought maybe this CPU was on the borderline of being marked as 3-core?

Message from [email protected] at Sep 8 02:51:51 ...
kernel:[Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: SRC (no timeout)
But to my surprise, a new error appeared, this time saying "node0, core0".

I ain't exactly a noob and I do not see how an ECC error can be a kernel issue, but I admit that I don't know everything.

The fact that the error happened on cache tag, not cache data, further implicates the CPU. The message is quite specific and I'd say rather trustworthy... But there's also the possibility that the message
No questions asked.

I read Processor as Proliant!

kernel:[Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD

L3 is a CPU cache, so I guess my CPU went berserk, or at least one of its cores.
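To compare many of these messages at a glance, the comma-separated "key: value" part of a "[Hardware Error]" line can be split into fields. This is a minimal sketch that assumes lines shaped like the ones quoted in this thread; parse_hw_error is a hypothetical helper, not part of any kernel or mcelog tooling.

```python
import re

# Matches the "[Hardware Error]:" tag and captures everything after it.
HW_ERROR = re.compile(r"\[Hardware Error\]:\s*(?P<body>.+)$")


def parse_hw_error(line):
    """Split the 'key: value, key: value' tail of a [Hardware Error] line.

    Returns a dict of fields, or None if the line carries no such tag.
    """
    match = HW_ERROR.search(line)
    if match is None:
        return None
    fields = {}
    for part in match.group("body").split(","):
        key, sep, value = part.partition(":")
        if sep:  # skip fragments that are not 'key: value'
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    sample = "kernel:[Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD"
    print(parse_hw_error(sample))
```

Here the "cache level" field comes out as L3/GEN, which is the part that points at the CPU's L3 cache rather than at a DIMM.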
What is a little bit peculiar about it is that the errors are not massive, just a single occurrence exactly every 5 minutes.

The forum post I reference above simply ends by telling the user not to worry about it if it only happened once and didn't cause any fatal issues.
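The "exactly every 5 minutes" observation can be checked against the timestamps in the "Message from ... at Feb 17 17:16:36" headers. This is a rough sketch only: syslog-style stamps carry no year, so one is assumed, month names are assumed to be in English, and both function names are made up for illustration.

```python
from datetime import datetime


def intervals_seconds(stamps, year=2013):
    """Gaps in seconds between consecutive 'Mon DD HH:MM:SS' timestamps.

    The year is an assumption, since the stamps themselves do not carry one.
    """
    times = [datetime.strptime(f"{year} {s}", "%Y %b %d %H:%M:%S") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


def is_periodic(stamps, period=300, tolerance=2, year=2013):
    """True if every gap is within `tolerance` seconds of `period`."""
    gaps = intervals_seconds(stamps, year)
    return bool(gaps) and all(abs(gap - period) <= tolerance for gap in gaps)
```

Feeding it the stamps collected from the console messages shows whether the errors really arrive on a fixed 300-second cadence or only roughly so.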
If so, is there a reference procedure somewhere?

In my experience you will start to get more and more errors, but it all depends on how fast the chip goes totally bad. I have seen it progress from a

This error occurred once while the server was idling...
I don't know what a Probe Filter directory is, but CptSupermrkt explained that above. Either way, it looks like a hardware error and I'd suspect the processor itself.

kernel:[ 2397.628114] [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD

-------------------------------------------------------------------------------
Model Name : GA-78LMT-USB3(rev. 4.1)
--------------------------
M/B Rev : 4.1
BIOS Ver : F4
Serial No. : 124940002615
Purchase

Message from [email protected] at Jul 26 06:20:44 ...
I have much reading to do :) –CptSupermrkt Sep 30 '13 at 21:30

@derobert that sounds like an answer, no? –Braiam Feb 7 '14 at 15:40

@Braiam
The fact that the error happened on cache tag, not cache data, further implicates the CPU.

They support SuSE Linux explicitly.

In my case the error went away after rebooting, but it is not the first time I have gotten corrected errors on this machine.
(answered Jun 19 '14 at 18:41 by Stephen Rondeau)

Maybe it was just the reboot.

I'd suspect faulty cooling before a bad CPU.

CPU temperature when running 4 XP (x2 CPU) virtuals with prime95, superpi and other various stress tests:

TOP snippet: