
Category: Intel

March 26th, 2008

55W PC power supply powering the dual-core computer

Posted by George Ou @ 12:35 am

Categories: Build it yourself, Desktop, Energy efficiency - green, Fun Stuff, Hardware, Intel, Networking, Security

Tags: Dual-core, PC, Power Supply, Computer, George Ou

Most computer builders in the world think I’m nuts for endorsing the use of 330 watt power supplies for a high-end performance computer.  Conventional “wisdom” says that anything under 500 watts is inadequate for an enthusiast PC.  “My power supply is bigger than your power supply” seems to be a typical mindset for many people but I’ve always had just the opposite desire to say that “my supply is smaller than yours and it works great”.  So when I started building mainstream dual-core computers with 220 watt 80 Plus power supplies, people were shocked that I would even consider such a small power supply.  But since I was able to build a 50W peak power dual-core computer, why not use an even smaller power supply in the sub-100 watt range?

FSP055-50LM SPI 55 watt open frame power supply

Pictured above is the fanless, open-frame, AC-input 55 watt FSP055-50LM power supply from Sparkle Power Inc with an MSRP of $39.  When power supplies are this small, people typically use DC input power supplies with an external AC brick.  Not so with this model, as it’s an all-in-one with the standard AC power connector you get on a normal ATX PC power supply.  It’s so small that it doesn’t even bother with a fan or metal casing; you have to add a system-level fan yourself and provide the bracing and shielding in your computer chassis.  The really nice thing about this solution is that the entire power supply, including the AC conversion part, is not much bigger than a DC power supply but you don’t need an external brick.

Using this 55W power supply, I took a dual-core Intel E2140 along with the bundled ECS945-GM motherboard I bought for $90 and built a computer with it using default clock speed and voltages.  Unfortunately, since the supply was missing a 4-pin power connector for the motherboard, I had to hot-wire a 4-pin CPU power connector from an older power supply to this unit to make it work.  That means two 12-volt yellow cables and two black ground cables had to be soldered into place and taped up.  Since these cables are safe for 10 amps each, which translates to 120 watts per 12-volt cable, I’m not even close to overloading the cables.

Once the computer came up, the power consumption at the plug peaked at 70W, which means the output power is around 52W at 75% efficiency, roughly 3W under the peak output of the power supply.  That is cutting it a bit close but it shows the extreme worst-case of what this PSU can handle.
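The arithmetic behind those figures can be sketched in a few lines of Python.  This is only an illustration: the function names are made up for this example, and the 75% efficiency is the estimate used above, not a measured value.

```python
def dc_output_watts(wall_watts, efficiency):
    """Estimate DC output power from the AC power measured at the plug."""
    return wall_watts * efficiency


def headroom_watts(rated_watts, wall_watts, efficiency):
    """How far the estimated DC load sits below the PSU's rated output."""
    return rated_watts - dc_output_watts(wall_watts, efficiency)


def cable_capacity_watts(amps, volts):
    """Power a single cable can carry at a given current limit and voltage."""
    return amps * volts


# 70W at the plug with an assumed 75% efficient supply
print(dc_output_watts(70, 0.75))        # 52.5
# Margin left on the 55W supply
print(headroom_watts(55, 70, 0.75))     # 2.5
# One 12-volt cable rated for 10 amps
print(cable_capacity_watts(10, 12))     # 120
```

The unrounded results (52.5W of load, 2.5W of headroom) are the "around 52W" and "3W under" figures quoted above.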

In reality, the 55W PSU isn’t practical for a mainstream dual-core computer although it would be more than powerful enough for an Intel D201GLY with Celeron 115, D201GLY2 motherboard with Celeron 120, or the Via low-power ITX platforms. The upcoming Intel Centrino Atom platform with the Atom-Diamondville CPU peaks at around 4W TDP so they’re even easier to power.

The bottom line is that this is a nice little power supply for small embedded solutions but you’ll want to stick with the bigger 80 Plus closed-frame models like the Sparkle SPI220LE 220W or the SPI270LE 270W if you’re building a mainstream PC.  Note that the SPI models are 1U power supplies so you’ll either need a very custom case or one that uses 1.75″ thin power supplies.

March 10th, 2008

Early photos of AMD Shanghai CPU

Posted by George Ou @ 5:18 am

Categories: AMD, Energy efficiency - green, Hardware, Intel, Processors, Servers

Tags: Shanghai, Photograph, Advanced Micro Devices Inc., CPU, Processors, Semiconductors, Hardware, Components, George Ou

Credit: Fuad Abazovic, Fudzilla

Photos of CPU-Z highlighting AMD’s 45nm Shanghai quad-core processor appeared on Fudzilla last week.  They confirm that AMD’s latest processor will have a total of 2 megabytes of L2 cache (512 KB per core) and 6 megabytes of shared L3 cache.

By contrast, AMD’s 65-nm Barcelona-class processors (Phenom and Opteron quad-core) only have 2 megabytes of shared L3 cache.  The L2 and L3 caches will mostly be exclusive, which means they will for the most part not hold the same content, effectively making the total cache size larger.

Shanghai’s core voltage of 1.15 V is equivalent to the low-voltage edition of AMD’s current 65nm quad-core processor Barcelona though it’s unclear if this particular Shanghai was operating at normal or low voltage.  According to Fuad Abazovic of Fudzilla, Shanghai is expected to operate above the 3 GHz mark though the CPU-Z photo has the clock speed left out.  We also need to put this in the context of Barcelona having a targeted clock speed of 2.8 GHz according to papers presented at ISSCC 2007 though actual production speeds have yet to exceed 2.3 GHz.

One other interesting note is that AMD’s Montreal 8-core processor due out after Shanghai will resort to MCM (Multi Chip Module).  Montreal will be two Shanghai cores glued on to a single processor package.  That means AMD will be adopting the same strategy Intel has been using on its 65nm and first-generation 45nm processors where you take two smaller cores and “glue” them on to a CPU package to have more cores per processor.  Ironically, Intel will be going the opposite direction starting with Intel Nehalem.  Not only will the initial Nehalem-EP 8 MB L3 cache quad-core processor be single-die, but even the much larger Nehalem-EX 8-core processor with 24 MB L3 cache will be single-die.  So in 2009, watch for both companies to reverse their marketing literature touting or disparaging MCM “glue” technology.

March 7th, 2008

Asus' 8.9" Eee draws crowds at CeBIT

Posted by George Ou @ 12:57 am

Categories: Energy efficiency - green, Fun Stuff, Hardware, Intel, Linux, Microsoft, Mobile/Wireless, News, Video Conferencing

Tags: ASUS, Webcam, CeBit, Bottom Line, Flash Memory, Microsoft Windows, Microsoft Windows XP, Linux, Operating Systems, Software

Here at CeBIT 2008, crowds descended on Hannover, Germany to see the latest technologies. Germany is certainly a lovely country but there’s nothing lovable about the 5.60 Euro per gallon gas prices.

CeBIT is certainly one of the more unique conventions I’ve been to since everything is spread out over a square kilometer and it’s like going to 10 mini conventions. While you get some outdoor air between the halls, don’t expect any fresh air with all the smokers there. The temperature delta certainly makes proper attire a challenge because it’s too warm inside and freezing outside.

Asus had a massive presence in building 26, which is one of the more popular spots at CeBIT, and they managed to draw crowds wanting to get a closer look at the new and improved 8.9″ Asus Eee PC. The new 8.9″ Asus Eee comes with more SSD flash storage, a bigger LCD screen with 1024×600 resolution, and a better quality webcam. The CPU is the same 900 MHz Pentium M as in the original Eee. [See gallery for a close-up view.]

The Windows XP model comes with 8 GB of SSD flash memory while the Linux model comes with 12 GB of SSD flash memory. So far we only know that the price will be 399 Euros (which typically means it will be less in dollars for the US market), but we don’t know if there will be a price difference between the Linux and Windows XP models. It is possible that the price of the extra flash memory offsets the licensing costs of Windows XP.

While holding the lightweight Eee with one hand, I tested the quality of the mic and the webcam and confirmed that the quality is fairly good. The webcam is definitely much better quality than the one on the old Eee. The Eee also comes with a wired 10/100 Ethernet port as well as 802.11g. The one downside to the Eee is that it doesn’t have a DVI output and instead has a DB-15 VGA port.

Here’s a comparison of the older 7″ Asus Eee versus the 8.9″ Eee. As you can see, the screen is much bigger and the color and contrast appear to be much better. The speakers had to be moved to the bottom of the laptop because the bigger screen pushed them off the lid. You can also see that the track pad is larger.

I wouldn’t be surprised if people buy the 12 GB Linux version and use NLite to install a trimmed-down version of XP, though having Linux on this device is extremely useful if you’re going to use it as a security auditing tool. The 8 GB of SSD is more than enough to hold the OS and key applications, and a $60 16 GB SDHC card is more than sufficient to hold plenty of movies and data. With the larger screen, nicer webcam, and adequate microphone, it becomes a great Skype video conferencing solution. The bottom line is that the Asus Eee is very pleasing in the hands and it runs Windows XP very quickly if you keep bloatware/crapware off of it.

March 2nd, 2008

Intel christens Silverthorne as "Atom"

Posted by George Ou @ 9:02 pm

Categories: Build it yourself, Consumer electronics, Desktop, Energy efficiency - green, Hardware, Intel, News, Processors, Servers, Storage, Virtualization

Tags: Anandtech, Intel Corp., Silverthorne, Atom Logo, Intel D201GLY2, Processors, Chipsets, Semiconductors, Hardware, Components

Intel has officially announced its new branding for the “Silverthorne” processor and the “Menlow” platform.  The Silverthorne processor will be called the “Intel Atom”.  The Menlow platform will be called “Intel Centrino Atom”.  The Intel Atom processor will be used in the Intel Centrino Atom platform.  The new Atom logos are shown below.

Intel released technical details of the new Silverthorne processor last month at ISSCC 2008.  This latest announcement gives Silverthorne and Menlow their official branding and their official logos.  Intel also released high resolution die shots at the right hand side of their press release.  A cut down rotated version of the die shot is shown below.

Here’s a summary of the new “Atom” processor:

  • Equivalent on single-threaded performance to original Pentium M “Banias” processor.  Faster if SSE3 instructions are used in the application or if multiple threads are involved.
  • 0.6W TDP (Thermal Design Power) to 2.5W TDP
  • Up to 1.8 GHz and DailyTech says sources inside Intel are saying that the 500 MHz version goes down to 0.6W TDP.
  • Idle power consumption can drop as low as 0.01W to 0.1W
  • Deep power down C6 state
  • Optimized register-file and cache 6T bits cells
  • CMOS mode on quad-pumped FSB IO
  • Split IO power supply
  • Single CPU core 2-issue in-order pipeline
  • SMT (Simultaneous Multithreading) architecture
  • 25mm^2 die size (2500 CPUs per 300mm diameter wafer)
  • Can achieve 2GHz core frequencies at 1.0V
  • Intel VT (Virtualization Technology)
  • Intel 64 architecture (formerly EM64T and compatible with AMD64)

Intel’s press release also mentions the processor codenamed “Diamondville”.  DailyTech reported some leaked information that Diamondville would be released in single-core and dual-core versions at 4W and 8W TDP.  Diamondville will be soldered on to an Intel 945GSE chipset motherboard and judging from the photo, it looks to be a replacement for the D201GLY and D201GLY2 developing market platforms.  The Intel D201GLY2 uses a lower power Celeron 220 (Core Solo architecture) with a TDP of 17W so Diamondville is a huge boost in energy efficiency.  The current D201GLY and D201GLY2 also utilize a third party SIS chipset which doesn’t support S3 sleep/suspend states while the Diamondville 945GSE platform will.

Given the fact that it’s highly unlikely (too expensive) that Intel would design a whole separate CPU for this type of a solution, it is very possible that Diamondville is simply a soldered-on-motherboard derivative of Silverthorne and the dual-core version is simply an MCM (Multi Chip Module) version of Silverthorne.  AnandTech’s Anand Lal Shimpi seems to agree with this theory and goes on to explain that the slightly higher TDP with slightly lower 1.6 GHz clock is simply due to a higher voltage allowing for much higher yields.  Since this is for the low-cost value market segment, that theory makes a lot of sense.

At present Intel seems to be hinting that Diamondville will also carry the “Atom” branding but they’re vague on the specifics.  What is certain is that the emerging market will enter into a whole new level of energy efficiency and the appliance/embedded do-it-yourselfers like me are drooling over Diamondville’s power specifications.

February 24th, 2008

Leaked Intel Nehalem performance projections over AMD Shanghai

Posted by George Ou @ 1:41 am

Categories: AMD, Hardware, Intel, News, Processors, Servers, Sun, Workstations

Tags: Performance, Shanghai, Advanced Micro Devices Inc., Intel Corp., Performance Management, Processors, Human Resources, Workforce Management, Semiconductors, Hardware

It appears that the rumors about Intel’s next major microprocessor “Nehalem” being a huge juggernaut may be true according to leaked documents from Sun Microsystems (removed Sunday night).  The slides appear to have been inadvertently placed on Sun’s publicly accessible website and “jokerman” posted the link on Aceshardware (thanks to a tip from ZDNet reader JumpingJack).  The slides look like the real thing meant for Intel’s partners and they’re probably well known in the server industry.

Reliable sources have reported in the past that Intel’s Nehalem processor will have three channels of DDR3 memory per CPU versus two channels of DDR2 memory per AMD Barcelona or upcoming Shanghai processor.  That would mean that AMD’s massive memory bandwidth advantage will turn into a large memory bandwidth disadvantage.  So what does this mean for Intel Nehalem’s performance?  Take a look at the following charts I generated after carefully measuring the length of the performance bars on a pixel level.

Since Intel’s charts were normalized to an Intel E5160 dual-core processor on SPECint_rate_base2006 and SPECfp_rate_base2006, I had to start somewhere and make some guesses on the base performance.  I used Intel’s highest published SPEC CPU integer and floating point score of 60.8 and 45.1 for the E5160 processor as of 2/23/2007.  This is probably not the exact reference point that Intel used so the numbers might be off a little.

When I compared my extrapolated numbers to the published SPECint scores for all of the shipping products other than the E5160, I found that Integer performance was 2% to 7% too low and the average was 4%.  When I compared with published SPECfp scores other than the E5160, I found that my extrapolated numbers were all 4% too high for all models except the Opteron 2220 extrapolation which was 12% too high.  To adjust for this, I raised the SPECint estimates 4% and dropped the SPECfp estimates 5% and generated the following chart which is a closer match to the published scores.
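The extrapolation and adjustment described above can be sketched roughly as follows.  The two baseline scores come from the text; the function names and the example bar ratio of 2.0 are hypothetical and purely for illustration, since the actual bar measurements aren’t listed here.

```python
# Published E5160 baselines used as the reference point (from the text)
E5160_SPECINT_RATE = 60.8
E5160_SPECFP_RATE = 45.1


def extrapolate(baseline_score, bar_ratio):
    """Scale the baseline by a bar's length relative to the E5160 bar."""
    return baseline_score * bar_ratio


def adjust(score, correction_pct):
    """Apply a flat percentage correction (positive raises, negative lowers)."""
    return score * (1 + correction_pct / 100)


# Hypothetical bar that is twice as long as the E5160 bar
bar_ratio = 2.0

raw_int = extrapolate(E5160_SPECINT_RATE, bar_ratio)
raw_fp = extrapolate(E5160_SPECFP_RATE, bar_ratio)

# Raise SPECint estimates 4% and drop SPECfp estimates 5%
adj_int = adjust(raw_int, +4)
adj_fp = adjust(raw_fp, -5)

print(round(raw_int, 1), round(adj_int, 1))
print(round(raw_fp, 1), round(adj_fp, 1))
```

A single flat correction per benchmark is the simplest possible calibration; it can’t fix an outlier like the Opteron 2220 floating point estimate.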

I tend to believe that the second adjusted chart is more accurate.  We’ll most likely know by the end of this year what the actual scores are, but I doubt they will be more than 5% to 10% off from these estimated projections.

So how can Intel pull off such a massive performance boost over their current reigning champion “Harpertown” X5482 processor?  Consider the fact that Intel’s current generation 45nm Harpertown processors lead the benchmarks despite the memory bandwidth disadvantage because of a much faster execution engine and larger cache.  Then we factor in the fact that Intel will implement SMT (dual threads per CPU core), improve the already-fast execution engine of Harpertown, and feed it with three channels of DDR3 memory per CPU instead of the old shared front side bus.  AMD’s Shanghai on the other hand is essentially a die shrink, a cache size boost, and a clock speed boost.  Taking all these things in to consideration would easily explain how Intel could widen the lead so far.

I would also note that Intel’s leaked slides compare these processors in pairs where the Opteron 2220 DC (Dual Core) faces off with the Intel E5160 DC processor and the 2222 faces the X5365.  These two pairs represent a snapshot in time of when the products competed against each other.  The last two pairings on top may be generous to AMD since Barcelona processors aren’t shipping yet because of the TLB bug whereas Intel launched the X5482 in November 2007.  AMD’s Shanghai processor didn’t have first silicon until four months after Intel showed off their first silicon at IDF in September 2007, and the difference is that Intel has shown Nehalem running a real operating system while AMD has not done the same for Shanghai.

Since it usually takes one year from first silicon to production parts, it’s a bit hard to believe that Shanghai will ship at the same time as Nehalem.  But even if it does ship at the same time as Nehalem, the competition from Intel looks very daunting if these estimates are anywhere close to being accurate.

February 4th, 2008

ISSCC 2008: Details on Intel Silverthorne

Posted by George Ou @ 4:36 am

Categories: Energy efficiency - green, Hardware, Intel, Mobile/Wireless, News, Processors

Tags: HyperThreading, Smart Phone, Power Consumption, Intel Corp., Intel Silverthorne, Smart Phones, E-mail, Ultramobile PCs (UMPCs), Processors, Consumer Electronics

At this year’s ISSCC 2008 (International Solid State Circuits Conference), details of Intel’s new 45nm Silverthorne will emerge.  Intel CTO Justin Rattner held a press briefing last Wednesday to preview some of the highlights of this week’s highly technical ISSCC conference in San Francisco.


Credit: Intel Corporation (from ISSCC preview presentation)

Intel Silverthorne is a brand new Intel x86 processor for the Menlow platform developed from the ground up for low-cost and ultra-low power applications.  This includes UMPC (Ultra Mobile PCs), MID (Mobile Internet Devices), set-top applications, some embedded applications, and eventually smart phone applications, though this initial generation may not be suitable yet.  Its small 25mm^2 die size on a 45nm process allows 2500 chips to fit on a single 300mm diameter wafer, which allows for extremely economical production.
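As a rough sanity check on the 2500-chips-per-wafer figure, here is a small sketch.  The edge-loss formula is a commonly used approximation and not necessarily how Intel arrived at its number; it ignores scribe lines, edge exclusion, and yield.

```python
import math


def gross_dies(wafer_diameter_mm, die_area_mm2):
    """Upper bound: wafer area divided by die area, ignoring edge loss."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    return int(wafer_area // die_area_mm2)


def dies_with_edge_loss(wafer_diameter_mm, die_area_mm2):
    """Common approximation that subtracts partial dies along the wafer edge."""
    d = wafer_diameter_mm
    s = die_area_mm2
    return int(math.pi * d ** 2 / (4 * s) - math.pi * d / math.sqrt(2 * s))


print(gross_dies(300, 25))           # area-only upper bound
print(dies_with_edge_loss(300, 25))  # estimate after edge loss
```

Both estimates land somewhat above 2500, so the quoted figure looks like a plausible round number once real-world losses are taken into account.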

From Rattner’s press conference last week, we know that Silverthorne will launch in the first half of 2008, but Rattner would not commit to a Q1 launch in response to one of the questions.  The first Silverthorne dies were publicly shown in April of 2007 at IDF China so it’s quite possible that we’re looking at a second quarter launch.  Rattner also explained that Silverthorne was a dual-issue in-order pipeline architecture with HT (Hyper-Threading) and that this was better than Hyper-Threading in an out-of-order architecture.  I later got verification via email that the HT type was SMT (Simultaneous multithreading) and not SoEMT (Switch-On-Event Multithreading).

The slides shown by Rattner indicated that Silverthorne had a power consumption below 1W and up to 2W and that it was “10x lower power than ULV Dothan”.  The Dothan was the second generation Pentium M product and ULV parts had a TDP (Thermal Design Power) of 5W.  I later got clarification via email that Silverthorne processors can have TDPs as low as 0.6W with lower clock speeds and higher clocked parts will have a 2 watt TDP.  I spoke with analyst David Kanter of Real World Technologies and he explained that 0.6W which doesn’t factor in chipset power consumption might be too high for smart phone applications.  However, its immediate successor in the Moorestown platform which may launch late 2008 may solve that problem with its SoC (System on Chip) design.

Update 3:10PM - There are quite a few inaccurate reports out there on Silverthorne’s power consumption.  They have reported the power consumption of Silverthorne as 0.6W to 2W which is not correct.  0.6W is actually a TDP rating which describes PEAK power consumption.  Actual idle power consumption can dip down to 0.01W for some models and 0.1W for other models.  Intel is not saying too much more right now but it is reasonable to assume that this extremely low power state is designed to maximize battery life in Smart Phones.  Keeping a continuous Skype or SIP application presence in a UMPC or MID device to receive calls is now a possibility. 

The 2 GHz variant of the Silverthorne processor will operate at 1 volt and it will have performance equivalent to a first generation “Banias” Pentium M notebook processor circa 2003.  Rattner confirmed this was for single-threaded performance on a broad range of applications.  This would seem to imply that with multithreaded applications, the performance would be even higher than Banias, which lacks Hyper-Threading.

Here are some additional quotes pulled from Rattner’s slides:

  • Deep power down C6
  • Optimized register-file and cache 6T bits cells
  • CMOS mode on quad-pumped FSB IO
  • Split IO power supply

Here are some additional email responses:

  • 0.6W to 2W measured TDP power on real world applications – over the lifetime of the processor/architecture
  • Can achieve 2GHz core frequencies at 1.0V
  • Will support features such as Digital Media Boost (SSE3), Intel Virtualization technology, Intel 64 Architecture support, HT

February 1st, 2008

San Clemente chipset gives HP lead on energy efficiency

Posted by George Ou @ 4:01 am

Categories: AMD, Energy efficiency - green, Hardware, Intel, News, Processors, Servers

Tags: Hewlett-Packard Co., Power Supply, Advanced Micro Devices Inc., Watt, Intel Corp., Chipsets, Semiconductors, Processors, Hardware, Components

The January 30th 2008 batch of test results is in for the SPECpower_ssj2008 energy efficiency benchmark and it looks like Hewlett Packard has claimed the energy efficiency lead with their newest low-cost 2U HP Proliant DL180 G5 server.  The secret to their success appears to lie in the selection of the Intel 5100 series “San Clemente” chipset.  While the detailed SPECpower disclosure doesn’t actually mention the chipset anywhere, the power characteristics, the six memory DIMMs, and the ICH-9 storage are a dead giveaway.

To see where the modern servers stand on power consumption, I’ve plotted out some ESTIMATED charts to compare the results.  Since the AMD system from Colfax International has 8 registered DDR2-667 DIMMs and the HP San Clemente system has 6 registered DDR2-667 DIMMs, I’ve had to adjust them both down to 4 DIMMs to do a fair comparison with the other Intel systems which used 4 DIMMs.  To do this I had to use an approximation based on known measurements for memory power consumption and I subtracted 1.875 watts to 3.75 watts for each registered DDR2-667 DIMM on a linear sliding scale based on load percentage.  That means I subtracted 7.5 watts for the AMD system at idle and 15 watts for the AMD system at peak power.  For the HP San Clemente system I subtracted 3.75 watts at idle and 7.5 watts at peak loads.
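The DIMM adjustment described above works out as in this short sketch.  The per-DIMM wattages and DIMM counts are the ones given in the text; the function names are just for illustration.

```python
IDLE_W_PER_DIMM = 1.875   # registered DDR2-667 at idle (from the text)
PEAK_W_PER_DIMM = 3.75    # registered DDR2-667 at full load (from the text)


def dimm_watts(load_fraction):
    """Per-DIMM power on a linear sliding scale from idle (0.0) to peak (1.0)."""
    return IDLE_W_PER_DIMM + (PEAK_W_PER_DIMM - IDLE_W_PER_DIMM) * load_fraction


def dimm_adjustment(installed_dimms, load_fraction, target_dimms=4):
    """Watts to subtract so the system is comparable to a 4-DIMM configuration."""
    return (installed_dimms - target_dimms) * dimm_watts(load_fraction)


# AMD system from Colfax: 8 DIMMs
print(dimm_adjustment(8, 0.0), dimm_adjustment(8, 1.0))   # 7.5 15.0
# HP San Clemente system: 6 DIMMs
print(dimm_adjustment(6, 0.0), dimm_adjustment(6, 1.0))   # 3.75 7.5
```

Intermediate load levels fall on the line between those endpoints, for example 11.25 watts for the AMD system at 50% load.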

Since it was shocking that a dual-processor eight-core 3 GHz Intel system was drawing lower power than a dual-processor four-core 2.4 GHz AMD system, I thought something might be a little off.  I realized that Colfax had used a pair of redundant 700 watt power supplies whereas the HP San Clemente system uses a single 750 watt power supply which means the power supply for the AMD system is relatively inefficient.  At this point I had to make a reasonable guess at PSU (Power Supply Unit) efficiency and I guessed that the HP single power supply had to be around 80% efficient whereas the Colfax dual-PSU would be around 70%.  Therefore I estimated the power consumption of the AMD system had it used an 80% efficiency power supply instead of a 70% efficient power supply.
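The PSU normalization can be sketched like this.  The 70% and 80% efficiencies are the guesses stated above, and the 200 watt wall reading is a made-up example value rather than a figure from the SPECpower results.

```python
def normalize_psu(wall_watts, actual_eff=0.70, reference_eff=0.80):
    """Estimate wall power had the system used a PSU with reference_eff.

    The DC load the components draw stays the same; only the conversion
    loss in the power supply changes.
    """
    dc_load = wall_watts * actual_eff
    return dc_load / reference_eff


# Hypothetical reading of 200W at the wall on the dual-PSU system
estimated = normalize_psu(200)
print(round(estimated, 1))   # 175.0
```

In other words, the adjustment scales every measured wattage by 0.70 / 0.80, a 12.5% reduction, so any error in the two efficiency guesses carries straight through to the result.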

Unfortunately this is a rough educated guess so the accuracy is dropping quickly but I wanted to take a reasonable shot at it to level the playing field on PSU efficiency.  Companies making SPECpower submissions in the future should avoid using dual power supplies and stick with 4-DIMM configurations so that we can get apples-to-apples comparisons and measurements.  For now the following estimated power consumption graph is what I came up with.

The thing that really sticks out is the fact that the Intel 3 GHz 45nm E5450 processor system uses less power most of the time than the special low-voltage variant of the Intel 2 GHz 65nm L5335 processor.  This shows how drastic an improvement Intel made using HKMG (High-K Metal Gate) materials and a shrink to the 45nm process.

The DIMM and PSU adjusted power consumption for the AMD Opteron 2216HE 4-core 2.4 GHz system has dropped significantly by more than 32 watts at the peak but it’s still more power hungry than the Intel 8-core 3 GHz E5450 at less than 80% load.  Despite the fact that AMD takes a deeper clock speed dive down to 1.0 GHz at idle while Intel only dives down to 2 GHz, Intel’s C1E state seems to dominate the power savings.

This can also tell us something about the “Barcelona” quad-core “HE” (High Efficiency) 1.9 GHz system because it has a TDP of 79 watts which is 11 watts higher than the 2216HE under maximum load per CPU.  Realistically the difference will be smaller than 11 watts per CPU and probably more like an 8 or 9 watt difference, so an AMD 2347HE 1.9 GHz dual-processor 8-core version would probably consume 16 more watts.  That would likely put the AMD 8-core 2347HE 1.9 GHz server at a higher power consumption level than the 8-core 3 GHz E5450 Intel server running on a San Clemente chipset.  That seems counterintuitive since Intel’s TDP rating for its 45nm 3 GHz processor is 80 watts and that doesn’t even count the memory controller on the motherboard.

When looking at the difference between the HP San Clemente chipset based server and the HP 5000 series chipset based server, there is roughly a 32 to 40 watt difference even though the two CPUs are identical.  Most of that difference is due to an extra 6 to 7 watts per FBDIMM and the remaining power delta is mostly due to the newer chipset on the motherboard.  Had both of these servers had 8 DIMMs, the power gap would have been approximately 26 watts wider because of the extra power consumed by the FBDIMMs on the Intel 5000 series chipset.
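The FBDIMM portion of that gap is simple multiplication, sketched below.  The 6 to 7 watt premium per FBDIMM is from the text; using the 6.5 watt midpoint for the 8-DIMM case is an assumption made here to reproduce the roughly 26 watt figure.

```python
def fbdimm_gap_watts(dimm_count, extra_watts_per_dimm):
    """Extra power FBDIMMs draw versus the same number of registered DIMMs."""
    return dimm_count * extra_watts_per_dimm


# With 4 DIMMs per system, the FBDIMM share of the 32 to 40 watt gap
low = fbdimm_gap_watts(4, 6)
high = fbdimm_gap_watts(4, 7)
print(low, high)   # 24 28

# Going from 4 to 8 DIMMs adds 4 more FBDIMMs at the 6.5W midpoint
print(fbdimm_gap_watts(8 - 4, 6.5))   # 26.0
```

Whatever is left of the 32 to 40 watt gap after the 24 to 28 watts of FBDIMM overhead is what gets attributed mostly to the chipset.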

Next I plot out the power-adjusted ESTIMATED energy efficiency numbers.  I adjusted all the systems to four DIMMs and gave the AMD Opteron system a boost in power supply efficiency from an assumed 70% efficiency to 80% efficiency.  Again this is a rough guess but it’s reasonable considering the fact that Colfax used a dual 700W power supply instead of a single 750W power supply.  If Colfax International is reading this blog then I would suggest to them not to shortchange their own results in the future and use 4 DIMMs and a single PSU like everyone else.

Hopefully the next batch of results will give us some performance numbers on faster single-socket systems using the Bigby chipset and a 45nm processor so we can see how high on the efficiency scale those servers will go.

January 28th, 2008

Mac Pro is now the cheapest high-end workstation

Posted by George Ou @ 12:40 am

Categories: Apple, Build it yourself, Hardware, Intel, Processors, Workstations

Tags: Hard Drive, Video Card, Apple Macintosh, Memory, Apple Inc., CPU, Bottom Line, Mac Pro, Desktops, Workstations

Earlier this month I wrote “Build a Mac Pro equivalent workstation for 1/3 the cost” and the pricing didn’t look good for the Mac.  Now that the new Mac Pro with updated specifications and a much lower price has come out, I figured it’s time to do an updated comparison.  But during my research I came to a stunning conclusion: it’s the cheapest name brand dual-processor workstation on the market IF you know how to buy third party memory and storage.  It’s not only cheaper than the slower $3817 Dell workstation I looked at earlier this month, but I can’t even build a cheaper generic PC clone unless I switched to a lower-end CPU.  If you’re in the market for a high-performance Apple workstation, keep reading to learn how to get the best deal.

The new Mac Pro uses Intel’s latest 5400 series “Stoakley” platform with the “Seaburg” chipset.  For the CPU, it uses the 1600 MHz FSB version of the 5400 series CPUs which have clock speeds of 2.8, 3.0, and 3.2 GHz.  The graphics card has gone from AMD/ATI 1900XT to an NVIDIA 8800GT.  The memory was upgraded from Fully Buffered DDR2-667 to Fully Buffered DDR2-800.

As configured in the screen shot to the left, the stripped down system is $2999 with relatively few memory DIMMs and two minimum hard drives.  Since Apple only reduces the price by $500 if you buy just one processor, and it would cost you $900 to replace that chip, it’s not worth buying the single-CPU configuration from Apple.  The memory and hard drive upgrades were still too expensive so I left them on the default settings but you will most likely have to take them out and replace them.  The video card would also cost more to replace with a third party brand so it isn’t worth skipping either.  It’s also possible that a third party 8800GT might not work so I wouldn’t even bother trying.

Now once you buy this system, you’re going to need to buy some fully buffered DDR2-800 memory which is still very hard to find at this time.  I found some for $245 (vendor claims Mac Pro tested) which is way more expensive than other generic memory but it’s way better than the $1500 Apple is asking for.  A few other people in talkback posted this link for two 2GB DDR2-800 at $220.  The price will probably drop $40 in coming months as these get more common but I think the price isn’t too bad at this point.  You will need to buy two of these for $440 if you want the system to run with the max four-channel memory, but be sure to populate each DIMM in a separate channel to get the maximum benefit.  Note that CPU-Z for Windows will let you confirm how many channels you’re running; I’m not sure about a Mac equivalent applet but I’ll update if I find out.

The hard drives can be replaced with any 3.5″ SATA hard drive and you can usually buy two 500 GB Seagate hard drives for $240 and put them in a RAID-1 configuration.  This does mean that you’ll either need to leave your OS on the single 320 GB hard drive or you’ll need to manually move the OS to the 500 GB RAID-1 volume which makes the OS boot faster.

Now you have a 2.8 GHz Mac Pro for less than $3800 with all the trimmings which makes it the cheapest high-end workstation on the market.  It’s still possible to get a great PC 2.33 GHz dual-processor workstation for less than $2400 but the high-end belongs to Apple.  However, it’s not really practical to build a lower-end Mac Pro since I’ve got it stripped down to the bone so Apple still has plenty of profit to make even if you don’t buy their outrageous components.  The bottom line is that Mac users can get a much better deal on Mac Pros than at the beginning of this month.
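For anyone who wants to check the math or plug in their own prices, the total can be tallied as follows.  The prices are the ones quoted above; the variable names are just for illustration.

```python
BASE_SYSTEM = 2999        # stripped-down dual-processor 2.8 GHz Mac Pro
MEMORY_KIT = 220          # one pair of 2GB FB DDR2-800 DIMMs
MEMORY_KITS_NEEDED = 2    # two pairs to fill all four memory channels
HARD_DRIVE_PAIR = 240     # two 500 GB SATA drives for RAID-1


def total_cost(base, kit_price, kits, drives):
    """Total price of the base system plus third party memory and drives."""
    return base + kit_price * kits + drives


total = total_cost(BASE_SYSTEM, MEMORY_KIT, MEMORY_KITS_NEEDED, HARD_DRIVE_PAIR)
print(total)   # 3679
```

That comes in under the $3800 mark, and well under what the same configuration costs with Apple’s own memory.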

Update 10:30AM
If you’re installing Boot Camp and Windows, do the installation after you set up the RAID-1 volume.  You will need these drivers from Intel’s website for Windows XP, Vista, and Server.  If you don’t want to spend $3700, you can live with a perfectly good dual-processor 2.33 GHz workstation for $2370 which has the same 5400 series chipset.  Apple seems to have figured out the perfect strategy to keep a high margin yet keep you from building a cheaper clone with exact specifications.

January 23rd, 2008

Analysis: Server Side Java energy efficiency versus load

Posted by George Ou @ 5:25 am

Categories: AMD, Energy efficiency - green, Hardware, Intel, Processors, Servers

Tags: Performance, Java, Workload, Efficiency, Advanced Micro Devices Inc., CPU, Analysis, Intel Corp., Energy Efficiency, SPEC

With the arrival of the latest standardized energy efficiency benchmark from SPEC, we have a good way to measure server efficiency.  In light of the recent controversy over flawed energy efficiency studies that have unfortunately been touted by so many in the press instead of SPEC, I thought I’d offer some more in-depth analysis on energy efficiency.

The new SPECpower_ssj2008 benchmark gives us a standardized way of measuring energy efficiency for Server Side Java.  SPECpower_ssj2008 gives us efficiency data at varying workloads going from 0% to 100% at increments of 10%.  Then it provides us with a Performance to Power Ratio curve along with an average efficiency of those 11 workload measurements.  The two graphs below are compiled from the SPEC database.  They represent the fastest Intel quad-core system (below left) versus the only AMD CPU submitted to the SPECpower_ssj2008 database to date, which is a special energy-efficient Opteron 2216HE (below right).

The two graphs above show more than a 3 to 1 advantage for the fastest Intel system when we look at it in terms of percent workload.  This is a perfectly valid way of analyzing the data, but the tradeoff is that you’re not seeing the efficiency of each processor at absolute workloads which might be valuable if you need a system with lighter workloads.  So to offer an alternative method of interpreting the efficiency data, I plotted out the following Efficiency versus CPU capacity graph with published data from SPEC (and some MS Excel help from analyst David Kanter).

  • DP = Dual Processor
  • UP = Single Processor (Uni-Processor)
  • QC = Quad Core
  • DC = Dual Core
  • FB = Fully Buffered
  • “Operations per joule” is identical to ssj_ops/watt unit used by SPEC.
  • “Operations per second” refers to Server Side Java performance.

The blue curve represents the Intel E5450 server shown in the SPEC “Performance to Power” chart above left while the cyan curve represents the AMD 2216HE system.  You’ll notice that the curves are somewhat close together at the lower workloads which means the AMD system is almost as efficient as Intel at lighter workloads.  But at peak performance levels, Intel is three times faster than the AMD 2216HE system and more than three times as energy efficient.  So if you had to buy three of the AMD 2216HE systems to get the same Server Side Java capacity as the Intel E5450, it would cost you three times the power.
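The "three times the power" conclusion can be sketched in normalized units. This assumes only what the paragraph states (Intel at roughly 3x the peak throughput and roughly 3x the peak efficiency of the 2216HE system); the function and variable names are made up for illustration, and real figures should come from the SPEC disclosures.

```python
import math

def power_draw(throughput, efficiency):
    """Power = throughput / efficiency (ops per second / ops per joule = watts)."""
    return throughput / efficiency

# Normalized units: the AMD 2216HE system at peak is 1.0 for both values.
amd_throughput, amd_efficiency = 1.0, 1.0
# Assumption taken from the text: Intel is ~3x faster and ~3x as efficient at peak.
intel_throughput, intel_efficiency = 3.0, 3.0

amd_power = power_draw(amd_throughput, amd_efficiency)
intel_power = power_draw(intel_throughput, intel_efficiency)

# Number of AMD systems needed to match one Intel system's peak capacity.
amd_systems_needed = math.ceil(intel_throughput / amd_throughput)
amd_total_power = amd_systems_needed * amd_power

print(f"AMD systems needed: {amd_systems_needed}")
print(f"Power ratio (AMD group / Intel system): {amd_total_power / intel_power:.1f}x")
```

Because the efficiency advantage is described as "more than" 3x, the real power ratio would come out somewhat above this normalized estimate.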

You’ll also notice the pink curve spiking upwards in efficiency just shy of the absolute peak efficiency level of Intel’s latest 45nm E5450 3.0 GHz quad-core CPU.  This single-socket single-processor 2.4 GHz XEON X3220 Intel server is by far the most efficient system at lighter workloads.  Had a newer single-socket CPU like the 45nm QX9650 3.0 GHz quad-core processor been used, the efficiency curve would probably fly off this chart.  Intel’s 5100 series “San Clemente” chipset will also get much better efficiency than anything on this graph because it uses lower power registered DDR2-667 memory like AMD.


January 22nd, 2008

The polycarbonate all-in-one 22" LCD PC

Posted by George Ou @ 1:47 am

Categories: Build it yourself, Desktop, Energy efficiency - green, Fun Stuff, Hardware, Intel, Processors

Tags: PC, Chassis, Motherboard, CPU, LCD, Computer, Productivity, Processors, Semiconductors, Hardware

The last time I built a wooden all-in-one 19″ LCD PC, my family wanted it in the kitchen and my mother wanted it in hers. To keep everyone happy, I built my mother another one (pictured above and below) out of 3/16th inch jet-black polycarbonate which makes the chassis look like the material from a grand piano. The result was something so glossy that I could probably shave in it, but I’m almost afraid to touch it and get fingerprints all over it. Needless to say, she is very pleased with her new space saving computer. [See photo gallery.]

Cutting this material was fairly simple with wood-cutting and drilling tools. Just be careful to slow down on the table saw so you don’t chip the polycarbonate. I had initially avoided putting in vent holes in the back but the CPU fan and the PSU fan dynamically ramped up in RPM because of the increasing temperature and caused some noise. Once the 4 holes were put in the back, the CPU fan stayed at lower RPM and remained fairly silent even if I stress loaded the CPUs.

This time I mounted the on/off switch up top along with two USB ports which makes it easy to access and comes in handy for the webcam. I just wish I had a webcam that did away with the cable and just had a down-facing USB plug so I could plug it in right on top of the case. The other USB port is convenient for plugging in USB memory sticks or other devices I want sitting on top of the chassis.

As usual with these slim custom chassis, I used a slim 1.75″ 1U Sparkle SPI220LE 80 Plus 220 watt power supply. The idle power consumption on this computer is 43 watts and 63 watts under peak CPU loads generated by WPrime. The motherboard is an ECS 945GCT-M which came bundled with an Intel Celeron 430 CPU (Conroe-L 1.8 GHz single-core) I got at Fry’s for $70. I put in an Intel E2140 1.6 GHz dual-core CPU instead and kept the lower-profile CPU fan which came with the Celeron 430. That lower-profile fan came in really handy since it fit inside my 3″ thick chassis, which has even less space inside because of the thickness of the walls. This chassis has plenty of room for additional devices such as a slim slot-loaded optical drive.


January 16th, 2008

Why DIDN'T the MacBook Air get the new 45nm CPU?

Posted by George Ou @ 3:36 am

Categories: Apple, Energy efficiency - green, Hardware, Intel, News, Processors

Tags: Apple MacBook, Apple Inc., CPU, Intel Corp., Notebooks, Processors, Hardware, Notebooks & Tablets, Semiconductors, Components


Intel launched their brand new 45nm mobile dual-core processors last week with 60% smaller packaging size.  Yesterday Apple announced their Über-sleek MacBook Air ultra-slim notebook which also uses a specially designed Intel dual-core CPU with 60% smaller packaging.  Naturally I assumed the new MacBook Air uses Intel’s latest Penryn-class 45nm technology with low leakage hafnium metal gates and I called Intel for confirmation of this “special” processor.  I thought to myself: What’s so special about it if every PC vendor can use the same shrunken CPU?

To my surprise, Apple didn’t use the newest 45nm mobile processor with 107mm^2 die size; they really did use a “one-off” “Merom” 65nm 143mm^2 die designed-just-for-Apple CPU from Intel.  Intel specially designed a larger 65nm core with a specially designed package that’s 60% smaller.  This means instead of using the latest 45nm processors that are faster and more energy efficient and are already that small without any special packaging, Apple got a “special” 65nm chip.

This begs the question why Intel doesn’t make its new 45nm packaging even smaller than the current 60% reduction in size if it can reduce its packaging by 60% on 65nm technology.  It also begs the question why Apple had to go to the trouble of a tailor made 65nm part when the 45nm part launched 3 weeks before the launch of the MacBook Air.  Several other PC makers were already showing off their 45nm based notebooks last week at CES.

I spoke to a few people about this and asked for some theories, and we came to a somewhat reasonable guess, so I’ll offer these up as some possible reasons.  For a product as specialized and exotic as the MacBook Air, the design would have needed to start some time ago.  When that design started, it may not have been certain that 45nm Mobile Penryn would be ready to ship by Macworld, and there may not have been working samples to start the design process.

Despite the fact that other PC makers have 45nm based notebooks ready to launch, none of them are this sleek.  So ultimately it doesn’t really change the appeal of the MacBook Air and it will be the thinnest notebook on the market.  In 20/20 hindsight perhaps it would have been better if the MacBook Air had shipped with a 45nm CPU and maybe we’ll see a quick refresh from Apple to the new processor since the size is obviously not a problem.  It’s just that “special” in this case isn’t a flattering thing when referencing the older CPU used in the MacBook Air, but the MacBook Air is still every bit special in a flattering way.

January 15th, 2008

Beware of flawed CPU efficiency study

Posted by George Ou @ 3:47 am

Categories: AMD, Energy efficiency - green, Hardware, Intel, News, Servers

Tags: Quad-core, Advanced Micro Devices Inc., CPU, Intel Corp., Processors, Semiconductors, Hardware, Components, George Ou

Update 2/22/2008 - I originally used the word “rigged” to describe Neal Nelson’s study. My reasoning for using the word “rigged” was due to the fact that the test platforms used in Nelson’s study painted an inaccurate picture. Nelson’s study omitted two generations of Intel products while including pre-shipping products from AMD. Since I cannot know for a fact whether the test subject selection was intentional or merely coincidental, I changed the word “rigged” to “flawed”. Other than this change, I stand by my analysis here.

What if we held a football game involving the Patriots and any other NFL team where we set up a Patriots handicap that prohibited Tom Brady and the rest of his starting lineup from playing? What if the result was a loss for the Patriots and we splash the headline across the sports newswire that the Patriots just lost a football game? Would you think this was ethical behavior? Well that’s precisely what happened yesterday when the Neal Nelson report titled “AMD beats Intel in quad-core server power efficiency” spread across the newswire and got repeated as fact.

Neal Nelson and Associates is a consulting firm that has made it a habit to put out these handicapped reports on processor efficiency. Last year they excluded Intel’s quad-core lineup when AMD didn’t have quad-core processors, declared AMD the winner, and got lots of news coverage. Now they’re comparing Intel chips released in Q4 2006 to AMD technology that may not be available to the general public until Q2-2008, and the press seems to be falling for it all over again.

Nelson compared AMD’s Opteron 2350 2.0 GHz quad-core processor (may not ship again until Q2-2008 when the TLB bug hopefully gets fixed) to Intel’s older 65nm “Clovertown” E5335 and E5345 processors, which were released in Q4 2006. These weren’t even the newest 65nm G Stepping Clovertown processors from mid-2007 with lower power consumption; these were the older stepping released in 2006. But Intel launched their latest 45nm “Harpertown” processors in November of 2007 and these chips were excluded from this “study” on AMD versus Intel energy efficiency. This is a classic case where the measurements are most likely accurate, but what’s being measured isn’t. This is a critical omission because the 45nm chips from Intel made significant improvements in both performance and energy efficiency, which has a double impact on performance per watt.

Nelson basically took a product from AMD that hasn’t even sorted out the bugs yet and can’t be purchased yet, then compared it to Intel’s 2006 technology while excluding two newer generations of Intel technology that are available in quantity, and he declares AMD the “winner” on energy efficiency. Then in an ultimate twist of irony, Nelson has the gall to question the methodology of the latest SPEC power efficiency standard SPECpower_ssj2008 when his own tests are outright deceptive. But in reality, SPEC doesn’t go out and declare winners or losers for cheap headlines or overstate the importance of their data; they merely present data with full vendor disclosures and provide valuable data points to the public.

When I did my in-depth review of SPECpower_ssj2008, I tempered the results for AMD despite the fact that the early SPECpower_ssj2008 results showed complete domination by Intel over AMD. I stated that the results would have been more competitive for AMD (at least at comparable clock speeds) if a web server version of SPECpower was used and when AMD quad-core Opteron gets its bugs sorted out. I still stand by that assessment based on the fact that AMD does well on a clock-for-clock basis when looking at SPECweb_2005 performance. However, Intel still commands the clock speed advantage which makes them the performance leader but at least AMD can be competitive on web serving duties at the lower clock speeds if they can fix their bug and launch their quad-core parts.

So who should the IT manager believe when it comes to performance per watt? Ideally you run your own tests on your own applications and draw your own conclusions but that may not be an option for everyone. If running your own tests isn’t feasible, I would recommend finding publicly acknowledged reputable benchmarks like those from SPEC or TPC and trying to find the benchmark that most closely resembles your workload. While that isn’t perfect, it’s the closest thing to commissioning your own tests. But what you should not do is rely on consulting firms that have a track record of fixing the game.

January 3rd, 2008

Build a Mac Pro equivalent workstation for 1/3 the cost

Posted by George Ou @ 5:04 am

Categories: Apple, Build it yourself, Energy efficiency - green, Fun Stuff, Hardware, Intel, Processors, Workstations

Tags: Processor, Apple Macintosh, Workstation, Wisdom, Dual Processor, Intel Corp., Mac Pro, Serial ATA, Chipsets, Workstations

Conventional wisdom tells us that a digital content creation or CAD professional has to fork out $6000 to $10,000 for a high-end 8-core dual-processor workstation, but this is Real World IT where I say screw conventional wisdom.  I’ve put on my mad scientist hat again and brewed something up for $2311 with equal or better performance than a $6803 Mac Pro (as configured in Apple screen cap to the left).  Now granted you can’t run Mac OS X so that might be a show stopper for a Mac user, but there are plenty of Windows users who want something that will run just as fast.  If that’s you, then keep reading!

The Mac Pro is essentially based on an Intel 5000 series dual-processor chipset.  At present, it still only comes with 65nm “Clovertown” processors maxing out at 3.0 GHz and not the recently launched 45nm “Harpertown” processors and newer motherboards that use the Intel 5100 series “San Clemente” chipset.  As I showed in my quad-core CPU comparisons, the newer 45nm processors costing $300 can rival $1200 65nm processors.  Furthermore, the 5100 series chipset supports cheaper and more energy efficient registered DDR2 memory instead of the power-hungry FBDIMMs (fully buffered DDR2 memory) used in the Intel 5000 series motherboards.

My home-brew 8-core solution costs about a third of the price with performance equal to or better than the fastest Mac Pro you can buy on the market.  And when it comes to SSE4 optimized video encoding, which nearly every video encoding software package is going to support, you can expect a massive increase in performance over the 65nm “Clovertown” quad-core processors.  Another improvement in my solution is a 5-drive hot-swap SATA backplane which allows you to easily swap out up to five hard drives.  The video card I used is an NVIDIA Quadro NVS290 designed specifically for the workstation market and it is also used in Sun’s single processor workstation.

Apple on the other hand uses the outdated ATI Radeon X1900 XT which is actually a desktop gaming graphics card and not a workstation card.  Below is the exact configuration and pricing for this system.  I also threw in a cordless Logitech EX110 keyboard and optical mouse.  Since Apple includes free shipping, my quoted prices (as usual) include the cost of shipping.  I also rounded to the nearest dollar and I do not include the effect of rebates in the quoted prices, though I mention one rebate in the part description.  I got these prices by roaming the search engines to find reasonable prices, mostly from places where I have personally shopped before.

Updated 5:45PM - All Windows drivers for the Intel 5100 series “San Clemente” chipset have now been confirmed and can be downloaded here so both systems are confirmed to operate any x86 or x64 version of Windows XP, 2003, Vista.  I have also verified XP and Vista x86/x64 driver support for all the other components.

Note that the use of FBDIMMs on the 5400 series platform adds about 7 watts of power consumption per DIMM, but the 5400 series “Seaburg” chipset has the added benefit of a 50% larger snoop filter and official DDR2-800 support so it’s a higher end chipset.  While the 5400 series chipset supports up to 16 FBDIMMs, the 5400 motherboard listed below has 4 DIMM slots whereas the 5100 series motherboard listed below has 8 DIMM slots.  You can get higher memory capacity 5400 series motherboards but they cost a little more so it’s a toss-up which chipset you should use.  You can get a Supermicro X7DWN+B for example which has dual gigabit LAN and 16 FBDIMM slots for an extra $150 over the price of the Tyan S5392ANR.
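As a back-of-the-envelope illustration of that memory power penalty, the sketch below multiplies the roughly 7 watts per FBDIMM quoted above by a few DIMM counts. The function name is made up for illustration and the per-DIMM figure is only the approximation from the text, not a measurement.

```python
FBDIMM_EXTRA_WATTS = 7  # approximate extra draw per FBDIMM, per the text

def fbdimm_overhead_watts(dimm_count, extra_per_dimm=FBDIMM_EXTRA_WATTS):
    """Approximate extra power attributed to populating FBDIMM slots."""
    return dimm_count * extra_per_dimm

# 4 slots (Tyan board as configured), 8, and 16 (maximum the 5400 chipset supports).
for dimms in (4, 8, 16):
    print(f"{dimms:2d} FBDIMMs: about {fbdimm_overhead_watts(dimms)} W extra")
```

On a fully populated 16-slot board the penalty grows to more than the entire peak draw of a small desktop, which is why the choice of memory type matters for efficiency.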

High-end 8-core 2P Workstation (5400 series “Seaburg” version):

| Part | Price (USD) |
|---|---|
| Tyan TEMPEST I5400XL (S5392ANR) Intel 5400 series “Seaburg” motherboard | 408 |
| 8 GB fully buffered DDR2-667 ECC memory (4 x 2GB) | 340 |
| Two Intel E5410 quad-core “Harpertown” 45nm 2.33 GHz CPUs | 616 |
| Seasonic 650W 88% efficiency “80 Plus” power supply | 160 |
| Cooler Master Cosmos EATX chassis (updated; replaces the Stacker ATX chassis) | 172 |
| NVIDIA Quadro NVS290 PCI-Express 256MB | 120 |
| Sound Blaster Audigy 7.1 | 36 |
| AMS 5-drive SATA hot-swap backplane (model DS-3151SSBK) | 102 |
| Two 500GB 7200RPM SATA hard drives | 200 |
| 18x DVD burner with SATA interface | 36 |
| Logitech EX110 wireless optical mouse and keyboard | 35 |
| Vista Business x64 edition OEM (dual-processor support) | 145 |
| Total (including shipping but not tax) | $2370 |

High-end 8-core 2P workstation (5100 series “San Clemente” version):

| Part | Price (USD) |
|---|---|
| 5100 series “San Clemente” dual-processor motherboard | 381 |
| 8 GB Registered DDR2-667 ECC memory (4 x 2GB) (4 slots open) | 310 |
| Two Intel E5410 quad-core “Harpertown” 45nm 2.33 GHz CPUs | 616 |
| Seasonic 650W 88% efficiency “80 Plus” power supply | 160 |
| Cooler Master Stacker ATX chassis (additional $60 rebate) | 170 |
| NVIDIA Quadro NVS290 PCI-Express 256MB | 120 |
| Sound Blaster Audigy 7.1 | 36 |
| AMS 5-drive SATA hot-swap backplane (model DS-3151SSBK) | 102 |
| Two 500GB 7200RPM SATA hard drives | 200 |
| 18x DVD burner with SATA interface | 36 |
| Logitech EX110 wireless optical mouse and keyboard | 35 |
| Vista Business x64 edition OEM (dual-processor support) | 145 |
| Total (including shipping but not tax) | $2311 |
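For anyone who swaps parts, the totals are easy to recheck. The sketch below simply sums the part prices from the two lists above and compares each build to the $6803 Mac Pro configuration quoted at the top of this article; the variable names are mine.

```python
# Part prices in US dollars, copied from the two lists above
# (shipping included, tax excluded).
seaburg_5400 = [408, 340, 616, 160, 172, 120, 36, 102, 200, 36, 35, 145]
san_clemente_5100 = [381, 310, 616, 160, 170, 120, 36, 102, 200, 36, 35, 145]
MAC_PRO_PRICE = 6803  # Mac Pro configuration quoted earlier in the article

builds = (("5400 Seaburg", seaburg_5400), ("5100 San Clemente", san_clemente_5100))
for name, prices in builds:
    total = sum(prices)
    share = total / MAC_PRO_PRICE
    print(f"{name}: ${total} ({share:.0%} of the Mac Pro price)")
```

Both builds land at roughly a third of the Mac Pro price, which is where the "1/3 the cost" headline comes from.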

If you don’t know how to build a PC or you’re rusty, here’s a step-by-step guide.  You can also have a local PC shop assemble the whole thing for around $100 or so and some will even install the OS for a little more money.  Other shops may just sell you all the parts for a minimal markup with no charge on assembly if you take this parts list to them.

As for which LCD display to buy, make sure you buy something that isn’t a typical TN type panel with lousy viewing angles and lousy 18-bit color.  Dell’s $700 24″ 2407WFP-HC is highly rated and it uses a high color PVA type panel with true wide viewing angles that don’t drastically drop in contrast ratio when viewed off center.  The inexpensive $300 24″ Soyo (available at Office Max) is actually an MVA type panel with true 24-bit color and wide viewing angles.  If you don’t need a super high color gamut, picking up two of the 24″ Soyos for dual-screens might be a great solution.  For comparison purposes, the Apple iMac 20″ uses the lousy TN type display while the 24″ iMac uses the superior PVA, MVA, or IPS TFT technology.

Update 5:45AM - What about Dell workstations?

Larry Dignan asked me about Dell solutions for the workstation market.  That’s a great question and I just looked it up on Dell’s website.  I configured a Dell Precision T7400 with identical CPU and GPU configuration but with the older Intel 5000 series chipset [Update 6:40AM - reader s_souche pointed out that the T7400 is actually based on the newer 5400 series "Seaburg" chipset which also uses FBDIMMs and has the highest memory capacity].  One problem was that it only allowed me to configure half the memory using four 1GB FBDIMMs.  This makes me wonder if there are only four DIMM slots in the entire system which would be rather unusual for a 5400 series motherboard.

It was also crazy that they charge an extra $350 to upgrade to a 500 GB SATA hard drive when those drives are barely worth $100 to begin with.  The total price for the RAM deficient system was $3817.  You will have to go out and buy your own 2GB FBDIMMs if you want to get up to 8GB RAM.  That’s not as bad as the Mac Pro configuration above but it’s still far worse than my home brew.

January 2nd, 2008

A comparison of quad-core server CPUs

Posted by George Ou @ 3:52 pm

Categories: AMD, Energy efficiency - green, Hardware, Intel, Processors, Servers, Workstations

Tags: Processor, Quad-core, Memory Bandwidth, Server, Dual Processor, Advanced Micro Devices Inc., CPU, Intel Corp., Chip, TLB

For anyone looking to buy a workstation or server CPU, quad-core CPUs have become mainstream. Therefore it’s important to know what you’re getting for the money so I’ve compiled a chart with general purpose computing performance using the SPEC CPU database with the highest scores as of December 28, 2007. I included single and dual processor solutions to help you decide whether you want to go with a single CPU socket or a dual socket motherboard. You can also read more about energy efficiency on server processors here.

Note: This information is also available as a PDF from the TechRepublic Downloads Library.

All Intel dual processor models starting with the 54xx are the latest “Harpertown” 45nm CPUs launched in November 2007. All Intel dual processor models starting with 53xx are the 65nm “Clovertown” quad-cores Intel launched in late 2006 and mid 2007. In the single processor space, only the QX9650 “Yorkfield” processor uses Intel’s latest 45nm process and everything else uses the 65nm process. The Q6600 and X3220 are essentially identical processors marketed towards desktop and entry level server markets respectively. Since one of the key differentiators on a workstation/server system is the inclusion of error correction memory, one can use any of the desktop CPUs in an ECC capable single processor motherboard.

The two AMD processors are Opteron quad-core CPUs based on 65nm “Barcelona”. The 2.0 GHz Opteron 2350 is delayed due to the TLB bug and the 2.5 GHz Opteron 2360SE won’t come out until the B3 stepping is out which fixes the TLB bug and brings higher clock speeds. There are reports that B3 stepping may be delayed until Q2 of 2008 (translated link here) though AMD’s last analyst meeting presentation has a rough timeline of Q1 or Q2.

Note: SPEC CPU is broken down by performance on general purpose integer and scientific memory-bandwidth/floating-point intensive workloads. The general purpose workloads are summarized by a geometric mean score called SPECint and the scientific workloads are summarized by a geometric mean score called SPECfp. The results are further broken down by single-threaded results and multi-threaded results labeled as “rate2006″. Note that a geometric mean is sort of like an average but it punishes the extremes more with a lower score than the average if a particular chip performs very poorly on some workloads. Ideally, one would simply benchmark their own specific application but that’s not always possible so these published numbers from SPEC are very valuable data points.
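To see how a geometric mean punishes a chip that does very poorly on a few workloads, here is a small sketch. The per-workload scores are hypothetical numbers chosen for illustration, not real SPEC results, and the helper names are mine.

```python
import math

def arithmetic_mean(values):
    """Ordinary average: sum divided by count."""
    return sum(values) / len(values)

def geometric_mean(values):
    """n-th root of the product, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical per-workload scores, NOT real SPEC results.
balanced_chip = [20.0, 20.0, 20.0, 20.0]
uneven_chip = [35.0, 35.0, 5.0, 5.0]  # same arithmetic mean, two weak workloads

for name, scores in (("balanced", balanced_chip), ("uneven", uneven_chip)):
    print(f"{name}: arithmetic={arithmetic_mean(scores):.1f}, "
          f"geometric={geometric_mean(scores):.1f}")
```

Both hypothetical chips have the same plain average, but the uneven one gets a noticeably lower geometric mean because its two weak workloads drag the product down.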

SPECint includes workloads like Perl, compression, compilers, video compression, and other general purpose workloads. SPECfp includes workloads like bwaves, gamess, gromacs, povray, and a dozen other memory bandwidth and floating point intensive benchmarks. So while it’s important to have a general idea of how a chip performs overall, discriminating buyers will look inside the detailed disclosure (which I link to) and look at the application that is most similar to their own. So while a chip from AMD might have a lower overall score on SPECfp_rate2006, there are individual workloads within SPECfp that overwhelmingly favor AMD’s memory bandwidth advantage. The inverse of this situation, where an Intel CPU has a lower overall SPECfp score than an AMD CPU but still wins some of the specific workloads, can also be true. So in a nutshell, the chip you select should be based on your application requirements.

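Mainstream dual processor server quad-core CPUs

| CPU Model | CPU Clock (GHz) | FSB (MHz) | SPECint 2006 | SPECint rate2006 | SPECfp 2006 | SPECfp rate2006 |
|---|---|---|---|---|---|---|
| Intel X5482 | 3.2 | 1600 | 26.1 | 147 | 22.2 | 85.2** |
| Intel E5472 | 3.0 | 1600 | 26.7 | 143 | 23.7 | 88.1 |
| Intel X5460 | 3.16 | 1333 | 27.7 | 138 | 23.9 | 79.2 |
| Intel X5450 | 3.0 | 1333 | 26.5 | 134 | 23.2 | 77.3 |
| Intel X5365 | 3.0 | 1333 | 24.5 | 117 | 21.4 | 67.7 |
| Intel E5410 | 2.33 | 1333 | 21.6 | 115 | 19.9 | 69.4 |
| Intel E5405 | 2.0 | 1333 | 19.2 | 104 | 18.2 | 64.7 |
| Intel E5335 | 2.0 | 1333 | 18.1 | 92.2 | 16.9 | 58.4 |
| AMD 2350 | 2.0 | NA | | 88.8* | | 77.9* |
| AMD 2360SE | 2.5 | NA | | 102* | | 86.3* |

Entry level single processor workstation/server quad-core CPUs

| CPU Model | CPU Clock (GHz) | FSB (MHz) | SPECint 2006 | SPECint rate2006 | SPECfp 2006 | SPECfp rate2006 |
|---|---|---|---|---|---|---|
| Intel QX9650 | 3.0 | 1333 | 25.5 | 76.7 | 22.3 | 52.0 |
| Intel QX6850 | 3.0 | 1333 | 23.6 | 69.1 | 21.2 | 49.4 |
| Intel X3220 | 2.4 | 1066 | 15.9 | 59.0 | 15.3 | 42.5 |
| Intel Q6600 | 2.4 | 1066 | 18.5 | | 16.0 | |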

* These results were invalidated last month because of lack of availability. Furthermore, the TLB bug patch performance penalty has not been factored in to these results. Assuming AMD fixes the bug in Stepping B3 and solves the manufacturing challenges in mid 2008 to deliver 2.5 GHz parts, scores similar to these invalidated numbers can be resubmitted. So while these numbers are officially invalidated, they were invalidated for lack of availability and not for inaccuracy so I left these numbers in for comparison purposes.

** Results for the X5482 3.2 GHz systems seem odd since they’re worse than the E5472 3 GHz results. Intel gave an unofficial estimate at IDF2007 of 89.8 for SPECfp_rate2006 so we might see this number get updated as time goes by. Note that the SPEC CPU base scores for the X5482 were higher than the E5472 so that seems to fall more in line with expectation.

These results indicate a significant improvement with Intel’s latest 45nm technology in multi-threaded applications. Comparing 3 GHz Harpertown with 3 GHz Clovertown, improvements for single-threaded applications were noticeable in the 8% range and that is mostly attributable to architectural enhancements in the chip’s execution engine. At 3.0 GHz for multi-threaded applications, we saw a ~14% improvement on both SPECint and SPECfp using the same motherboard chipset and the additional gains are mostly due to the 50% larger CPU cache. But once the new 5400 series “Seaburg” chipset got involved with a 50% larger snoop filter and 20% faster memory bus, the 3.0 GHz scores jumped 22.2% for SPECint and 30.1% for SPECfp.
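These percentages can be rechecked from the table. The sketch below assumes the comparison pairs are the X5450 (Harpertown on the same 1333 FSB platform) and the E5472 (Harpertown on the 1600 FSB “Seaburg” platform), each measured against the 3.0 GHz X5365 Clovertown; that pairing is my reading of the paragraph, and the helper names are made up.

```python
# (SPECint2006, SPECint_rate2006, SPECfp2006, SPECfp_rate2006), copied from the table above.
METRICS = ("SPECint2006", "SPECint_rate2006", "SPECfp2006", "SPECfp_rate2006")
clovertown_x5365 = (24.5, 117, 21.4, 67.7)
harpertown_x5450 = (26.5, 134, 23.2, 77.3)   # assumed same-chipset comparison
harpertown_e5472 = (26.7, 143, 23.7, 88.1)   # assumed Seaburg comparison

def gain_percent(new, old):
    """Percentage improvement of new over old."""
    return (new / old - 1.0) * 100.0

def compare(label, new_scores, old_scores):
    """Print the per-metric gain of one chip over another."""
    print(label)
    for metric, new, old in zip(METRICS, new_scores, old_scores):
        print(f"  {metric}: {gain_percent(new, old):+.1f}%")

compare("X5450 vs X5365", harpertown_x5450, clovertown_x5365)
compare("E5472 vs X5365", harpertown_e5472, clovertown_x5365)
```

With that pairing, the single-threaded gains come out a little over 8%, the same-chipset rate gains a little over 14%, and the Seaburg rate gains at 22.2% and 30.1%, which lines up with the figures quoted above.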

Considering the fact that the energy efficient 45nm Intel E5410 2.33 GHz chip costs around $300 whereas the 65nm Intel E5345 2.33 GHz chip costs around $600, buyers who are looking for Intel based solutions should immediately switch to 45nm technology. The Intel E5410 even manages to beat the $1200 Intel X5365 3.0 GHz processor on SPECfp_rate2006 and comes awfully close on SPECint_rate2006. So for the general purpose server market, the new E5410 on average seems to be the performance/dollar leader.
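A rough way to put numbers on the performance/dollar claim is to divide the rate scores by the approximate chip prices mentioned above. This is only a sketch: the prices are the round figures from the text, the scores are dual-processor results while the prices are per chip (the ratio between the two chips is unaffected since both are 2P systems), and the names are made up.

```python
# Approximate per-chip prices from the text; rate scores from the table above.
chips = {
    "E5410 (45nm, 2.33 GHz)": {"price": 300, "int_rate": 115, "fp_rate": 69.4},
    "X5365 (65nm, 3.0 GHz)": {"price": 1200, "int_rate": 117, "fp_rate": 67.7},
}

def per_dollar(score, price):
    """Benchmark score per dollar of CPU price."""
    return score / price

for name, chip in chips.items():
    int_value = per_dollar(chip["int_rate"], chip["price"])
    fp_value = per_dollar(chip["fp_rate"], chip["price"])
    print(f"{name}: {int_value:.3f} SPECint_rate/$, {fp_value:.3f} SPECfp_rate/$")
```

On these round numbers the E5410 delivers close to four times the score per dollar of the X5365 on both metrics.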

HPC (High Performance Computing) customers who have memory bandwidth intensive workloads on the other hand have been purchasing loads of inexpensive AMD Barcelona processors despite the TLB bug. Those memory-bandwidth hungry customers are using custom Linux kernels that work around the TLB bug with minimal impact on performance so they don’t care about the bug or the lower overall SPECfp scores.

December 19th, 2007

Hitting 50W peak on a dual-core desktop computer

Posted by George Ou @ 8:54 pm

Categories: AMD, Build it yourself, Consumer electronics, Desktop, Energy efficiency - green, Fun Stuff, Hardware, Intel, News, Processors

Tags: Desktop, Dual-core, Stock, Power Consumption, Memory, Motherboard, Computer, Watt, Desktop Computer, Processors

I have some great news for the green computing world.  The 50W no-compromise dual-core commodity desktop PC is now a reality!  It all started a few months back when I looked into the possibility of building a mainstream dual-core desktop computer that can drop under 50 watts idle, but now I’ve answered that question beyond all expectations.  Using a 220W Sparkle SPI220LE “80 Plus” efficient power supply, an Intel E2140 1.6 GHz dual-core CPU running at lower-than-spec 0.95 volts, and a Gigabyte G33M-DS2R motherboard, the system comes in just under 50 watts at *PEAK* CPU load generated by WPrime running 2 threads.  If I could only find a smaller 100 watt 80 Plus power supply and hit the optimum 50% loading at peak power consumption, then it might be possible to get peak system loads down to around 45 watts.
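The point about power supply loading can be sketched with simple arithmetic. The example below treats the roughly 50 watt peak figure as the load on the power supply and ignores conversion losses, which is a simplification; the function name and the list of supply sizes are only for illustration.

```python
def load_fraction(system_watts, psu_rated_watts):
    """Fraction of the power supply's rated output being drawn."""
    return system_watts / psu_rated_watts

PEAK_WATTS = 50  # approximate peak system draw from the text

# 220 W is the SPI220LE used here; 100 W is the hypothetical smaller supply.
for psu_watts in (220, 100):
    fraction = load_fraction(PEAK_WATTS, psu_watts)
    print(f"{psu_watts} W supply at {PEAK_WATTS} W peak: {fraction:.0%} loaded")
```

The 220 watt supply never gets much past a fifth of its rating on this system, while a 100 watt unit would sit right at the 50% loading mentioned above.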

At idle the system uses 41 watts which is actually one watt higher than my sub-$400 All-in-One LCD PC with an ECS 945GCT-M motherboard and an Intel E2180 2.0 GHz dual-core running at stock speeds and voltage.  It turns out that this G33M-DS2R board with E2140 CPU running at stock speeds and voltage has an idle system power of 46 watts which is 6 watts higher than the ECS board with E2180.  This was surprising to me since the new G33 chipset has a more energy efficient memory controller than the 945 chipset.

One possible explanation is that the G33-based motherboard was running the memory at 400 MHz base clock (DDR2-800 memory) whereas the 945-based motherboard was running the memory at 200 MHz.  Another factor is that the Gigabyte G33M-DS2R Intel G33-based motherboard has a 6-port SATA ICH9R RAID controller along with a few more memory and PCI slots.  This leads me to think that the combination of 2x the memory clock and more components translates to an additional 6 watts of power consumption.

The following idle/peak power consumption charts are from data I collected.

* SPI SPI220LE 220W 80+ PSU
** No system fan which saves 1W power

Gigabyte with Intel CPU = G33M-DS2R motherboard
Gigabyte AMD CPU = MA69GM-S2H motherboard
MSI with AMD CPU = K9AGM2-FIH motherboard

December 14th, 2007

SPEC launches standardized energy efficiency benchmark

Posted by George Ou @ 7:50 am

Categories: AMD, Energy efficiency - green, Hardware, Intel, News, Processors, Servers

Tags: Performance, Java, Quad-core, Power Consumption, Server, Advanced Micro Devices Inc., Energy Efficiency, SPECpower_ssj2008, Servers, Processors

SPEC (Standard Performance Evaluation Corporation) launched its first standardized energy efficiency benchmark SPECpower_ssj2008 this week which tackles something that the computer industry has struggled to define in recent years.  With datacenter energy costs spiraling out of control, server customers have struggled to sort out the conflicting messages from technology vendors about who is the energy efficiency leader.  Now the industry has a standardized way to measure the energy efficiency of computer servers.

Even though this first version of SPEC Power only addresses server side Java performance, it is one of the most comprehensive standards for energy efficiency to date, giving it instant credibility.  Other energy efficiency metrics like the Green500 list simply take the theoretical aggregate FLOPS (Floating Point Operations Per Second) of a cluster of computers and divide it by the measured peak power consumption or even peak rated power consumption if measurements aren’t given.  Since FLOPS aren’t really a good real-world measurement of performance to begin with and most people don’t operate their servers at peak loads or run massive clusters, the Green500 list simply isn’t that useful of a metric.

SPECpower_ssj2008 is basically a measure of ssj_ops/watt (server side Java operations per second per watt).  I would personally prefer to call it ssj_opj (server side Java operations per unit of energy in Joules) since “per second per watt” is by definition “per Joule”.  SPECpower_ssj2008 factors in the fact that servers usually aren’t operated at peak capacity and they’re even idle at times.  To factor for idle and peak load power consumption, average power consumption at 0, 10, 20, 30, all the way through 100 percent load capacity are measured and disclosed.  Then server side Java operations per second are divided by the average power consumption in watts at every 10% increment and then all the scores are averaged again to produce the “overall” ssj_ops/watt metric.  The following graph is from the current SPECpower_ssj2008 performance leader as of December 12th, 2007 and it illustrates how this benchmark works.
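Here is a minimal sketch of that calculation. The measurements are hypothetical numbers generated for illustration, not real SPEC results, and the function names are mine. The first function averages the per-level ratios as described above; the second divides total operations by total power across all levels (including idle), which as far as I understand is how SPEC computes its published overall figure, so treat that as an assumption and check the SPEC run rules.

```python
# Hypothetical measurements, NOT real SPEC results:
# (target load %, ssj_ops per second, average watts) at 0%, 10%, ..., 100%.
levels = [(load, 2000 * load, 150 + load) for load in range(0, 101, 10)]

def per_level_efficiency(levels):
    """ssj_ops per watt at each load level (ops/s divided by J/s = ops per joule)."""
    return [ops / watts for _, ops, watts in levels]

def mean_of_ratios(levels):
    """Average of the per-level ssj_ops/watt scores."""
    ratios = per_level_efficiency(levels)
    return sum(ratios) / len(ratios)

def ratio_of_sums(levels):
    """Total ssj_ops over all levels divided by total average power."""
    total_ops = sum(ops for _, ops, _ in levels)
    total_watts = sum(watts for _, _, watts in levels)
    return total_ops / total_watts

for (load, _, _), eff in zip(levels, per_level_efficiency(levels)):
    print(f"{load:3d}% load: {eff:6.1f} ssj_ops/watt")
print(f"mean of per-level ratios: {mean_of_ratios(levels):.1f}")
print(f"ratio of sums:            {ratio_of_sums(levels):.1f}")
```

Either way, the idle measurement contributes power but no operations, so a server with high idle draw gets pulled down even if it is efficient at full load.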

 

I spoke to Walter Bays, the President of SPEC, yesterday about this new power benchmark and my preference for using ssj_opj was one of the topics that came up.  I also asked Bays why there couldn’t also be a SPECint_rate2006/watt or SPECfp_rate2006/watt measurement.  Although Bays couldn’t comment specifically on availability or the existence of future benchmarks, he did explain that the SPEC CPU (SPECint and SPECfp) benchmarks are peak throughput only, which would be fairly simple to measure and interesting.  The resulting metric would be a lot more valuable than the FLOPS/watt rating used in the Green500 list since SPEC CPU is much more comprehensive than a simple FLOP measurement.  Bays also explained that SPECweb2005 might be a good candidate but it was a more complex benchmark (due to the multiple systems involved), making it too much to tackle for the initial version of SPECpower.
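To make the ssj_ops/watt arithmetic described above concrete, here is a minimal Python sketch.  The load levels and the linear power model are invented purely for illustration and do not come from any published SPEC result.  The first helper shows why “per second per watt” is the same thing as “per Joule”.  The second follows the averaging as this post describes it, and the third shows an alternative (total operations divided by total power) for comparison.  The two aggregations can differ noticeably, and SPEC’s own run and reporting rules are the authority on which one the official metric uses.

```python
# Hypothetical load levels: (target load %, ssj_ops per second, average watts).
# The figures come from a made-up linear model, not from any published result.
LEVELS = [(load, load * 2_000, 150.0 + load) for load in range(100, -1, -10)]


def ops_per_joule(ops_per_second, watts, seconds=1.0):
    """Operations per joule over an interval; equals ops_per_second / watts."""
    total_ops = ops_per_second * seconds
    joules = watts * seconds  # 1 watt sustained for 1 second is 1 joule
    return total_ops / joules


def average_of_ratios(levels):
    """Average the per-level ssj_ops/watt scores, as the post describes."""
    scores = [ops / watts for _, ops, watts in levels]
    return sum(scores) / len(scores)


def ratio_of_sums(levels):
    """Alternative aggregation: total ssj_ops divided by total watts."""
    total_ops = sum(ops for _, ops, _ in levels)
    total_watts = sum(watts for _, _, watts in levels)
    return total_ops / total_watts


if __name__ == "__main__":
    for load, ops, watts in LEVELS:
        print(f"{load:3d}% load: {ops / watts:8.1f} ssj_ops/watt")
    print(f"average of ratios: {average_of_ratios(LEVELS):.1f}")
    print(f"ratio of sums:     {ratio_of_sums(LEVELS):.1f}")
```

Note that the idle level contributes a score of zero to the average of ratios but still adds its full power draw to the ratio of sums, which is why idle power consumption weighs on the overall number either way.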

November 16th, 2007

More images and products from supercomputing 2007

Posted by George Ou @ 6:55 pm

Categories: AMD, Build it yourself, Energy efficiency - green, Hardware, Intel, Networking, Processors, Servers, Storage, Sun, Supercomputing

Tags: Processor, Blade Server, Blade, Sun Microsystems Inc., Server, Power Supply, Motherboard, Advanced Micro Devices Inc., Intel Corp., Supercomputing

The SC07 supercomputing conference was a very interesting show for me this year and it was my first time attending this conference.  Here are some more interesting products that I haven’t covered yet all the way from the very high-end to entry-level HPC computers.

Sun’s 3456-node “petascale” constellation cluster

This poster showed a high-level diagram of how Sun’s constellation system uses a massive centralized InfiniBand switch to connect multiple blade racks.

This massive 3456-port 20 Gbps InfiniBand switch from Sun is the size of a wide refrigerator and it provides 3:1 InfiniBand port consolidation.

This is one of the InfiniBand blades that plugs into the 3456-port switch.

This is the SunBlade 6048 modular blade system.  It supports quad-processor quad-socket blade servers using Intel, AMD, and SPARC processors.

November 16th, 2007

How Rackable saves power with impeller fans

Posted by George Ou @ 5:14 am

Categories: AMD, Energy efficiency - green, Hardware, Infrastructure, Intel, Servers, Storage, Supercomputing

Tags: Fan, Data Center, Server, Trailer, Data Centers, Storage, Servers, Processors, Hardware, Data Management

Rackable Systems had their Ice Cube modular data center on display at the supercomputing conference this week.  I didn’t get to see it at IDF when it was parked outside so I grabbed a few photos and inspected the data center as it was parked in the corner of the convention center.  Pictured left is the back of the Ice Cube trailer.

The trailer has 1400U of half-depth rack space and can house 11200 CPU cores using dual-processor quad-core servers.

What’s unique about the Ice Cube is that its servers don’t need any fans because the massive impeller fans create so much of a vacuum that air flows through each of the servers to fill the vacuum.

The fact that the Ice Cube can sit outside often means you can get free cold air in many parts of the country.  Each server is powered by DC which further saves you power since you don’t need to convert AC into DC multiple times.

Microsoft and others have been contemplating the possibility of just using modular trucks instead of building extremely expensive datacenters.

The picture to the left shows a covered impeller fan.  All the impeller fans in the Ice Cube trailer take 5,000W combined but they can save you up to 25,000W of tiny fans you no longer need in those 1400 servers.  Not only does that save power, that’s 5600 fans you don’t need to worry about maintaining if they break down.
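The Ice Cube figures quoted above can be cross-checked with a few lines of Python.  This is only a back-of-the-envelope sketch using the numbers in this post.  The one-server-per-U layout and the assumption that the trailer runs around the clock all year are mine, not Rackable’s.

```python
# Figures quoted in the post.
SERVERS = 1400             # 1400U of rack space, assuming one server per U
SOCKETS_PER_SERVER = 2     # dual-processor servers
CORES_PER_SOCKET = 4       # quad-core processors
SERVER_FANS = 5600         # small fans that are no longer needed
SERVER_FAN_WATTS = 25_000  # "up to" figure for all the small fans combined
IMPELLER_WATTS = 5_000     # all impeller fans combined

# Assumption for illustration: the trailer runs 24 hours a day, 365 days a year.
HOURS_PER_YEAR = 24 * 365

total_cores = SERVERS * SOCKETS_PER_SERVER * CORES_PER_SOCKET
fans_per_server = SERVER_FANS / SERVERS
watts_per_fan = SERVER_FAN_WATTS / SERVER_FANS
net_watts_saved = SERVER_FAN_WATTS - IMPELLER_WATTS
kwh_saved_per_year = net_watts_saved * HOURS_PER_YEAR / 1000

if __name__ == "__main__":
    print(f"CPU cores in the trailer: {total_cores}")
    print(f"fans removed per server:  {fans_per_server:.0f}")
    print(f"watts per removed fan:    {watts_per_fan:.2f}")
    print(f"net power saved:          {net_watts_saved} W")
    print(f"energy saved per year:    {kwh_saved_per_year:.0f} kWh")
```

Since the 25,000W figure is an upper bound, the net saving and the yearly energy figure are upper bounds as well.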

November 15th, 2007

NASA gets SGI 2048-core Itanium 2 supercomputer

Posted by George Ou @ 5:17 am

Categories: Hardware, Intel, News, Processors, Servers, Supercomputing

Tags: SGI Altix, Processor, NASA, Supercomputer, Intel Itanium, Intel Itanium 2, Silicon Graphics Inc., SPECint_rate2006, SPECfp_rate2006, UNIX

I had a chance to speak with NASA and SGI at the SC07 supercomputing convention in Reno this week where I saw one of the biggest supercomputers in the world.  Pictured left is a 1024-core version of the Altix 4700 and NASA just bought one with twice as many processors (1024 dual-core Itanium 2 processors) based on the Montecito variant of Intel’s Itanium 2 processor and 4 Terabytes of RAM.

This massive supercomputer is the most powerful single node computer in the world (based on the SPECint_rate2006 and SPECfp_rate2006 databases) and it has one of the largest single-system memory pools in the world.  For some applications that simply can’t be effectively broken down into smaller tasks that a cluster can handle using smaller nodes because of excessive communications overhead, this is really the only system that can crunch those hard problems.

To give you some idea how powerful this system is, a 256-core version of the SGI Altix 4700 has a SPECfp_rate2006 score of 3507 and a SPECint_rate2006 score of 2970.  The biggest 16-core Intel X7350 2.93 GHz server scores 119 on SPECfp_rate2006 and 214 on SPECint_rate2006.  The biggest 16-core AMD Barcelona server has a SPECfp_rate2006 score of 136 and a SPECint_rate2006 score of 160.  A 16-core IBM Power6 has a SPECfp_rate2006 score of 428 and a SPECint_rate2006 score of 478 though the latest 32-core version probably has double that performance.  But even the Power6 is dwarfed by the 256-core SGI machine let alone what a 2048-core version can do.
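To put those scores side by side, the following Python sketch computes how many times larger the 256-core Altix result is than each 16-core result, along with a rough per-core figure.  The scores are the ones quoted above.  Dividing a rate score by the core count is only a crude normalization that I am using for illustration, and it does not imply that these benchmarks scale linearly with core count.

```python
# (cores, SPECfp_rate2006, SPECint_rate2006) as quoted in the post.
SYSTEMS = {
    "SGI Altix 4700": (256, 3507, 2970),
    "Intel X7350 server": (16, 119, 214),
    "AMD Barcelona server": (16, 136, 160),
    "IBM Power6": (16, 428, 478),
}


def per_core(score, cores):
    """Crude normalization: benchmark score divided by core count."""
    return score / cores


def altix_advantage(name):
    """How many times larger the Altix fp and int scores are than `name`'s."""
    _, altix_fp, altix_int = SYSTEMS["SGI Altix 4700"]
    _, fp, integer = SYSTEMS[name]
    return altix_fp / fp, altix_int / integer


if __name__ == "__main__":
    for name, (cores, fp, integer) in SYSTEMS.items():
        fp_ratio, int_ratio = altix_advantage(name)
        print(
            f"{name:22s} fp/core={per_core(fp, cores):6.2f} "
            f"int/core={per_core(integer, cores):6.2f} "
            f"Altix is {fp_ratio:5.1f}x fp, {int_ratio:5.1f}x int"
        )
```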

Of course there are plenty of jobs that do break down nicely for clusters and plenty of jobs that don’t need that much single-node memory.  That’s why NASA also purchased an Altix “ice” 8200 cluster using 16 of the racks pictured left.  Each one of these racks contains 64 dual-processor Intel XEON x86/x64 servers and 16 of these make a 1024 processor cluster with 4096 XEON CPU cores.

The Altix 8200 rack includes the 20 Gbps InfiniBand switches on the sides for the cluster interconnect and the racks can be chained together with InfiniBand.  NASA has for the most part used very large shared memory systems like the Altix 4700 above but they’ve just started buying the clustered systems.

November 12th, 2007

Intel launches world's first 45nm processors

Posted by George Ou @ 9:00 am

Categories: Energy efficiency - green, Hardware, Intel, News, Processors, Servers, Workstations

Tags: Chipset, Processor, Intel Corp., Series Chipset, Chipsets, Semiconductors, Processors, Hardware, Components, George Ou

Updated 5:00PM - Intel extended its lead in microprocessors today by launching the world’s first 45nm microprocessors.  Along with the new “Penryn” 45nm processors being launched today, Intel is also launching the “Seaburg” chipset designed for the HPC (High Performance Computing) market, which is timed perfectly with this week’s supercomputing conference.  The platform that couples the Seaburg chipset with the new CPU was codenamed “Stoakley”.  Both the new processor and chipset will officially be called the 5400 series processor and chipset.

The new 5400 series processor is built on a brand new 45nm process using High-K dielectrics and dramatically cuts power consumption.  Its key performance enhancements over the previous generation Intel 65nm processors are higher clock speeds, 50% more level-2 cache, enhanced SSE3, a brand new SSE4 instruction set which can double the performance of video encoding, and enhanced dividers.  The following is a table of the new server and HPC workstation processors launched today.  Note that the FSB 1600 models are designed for the new 5400 series chipset whereas the FSB 1333 and 1066 models will work on the existing 5000 series “Blackford” chipset which Intel launched back in mid 2006.

Intel Processor   Clock (GHz)   L2 Cache (MB)   FSB    TDP    Cores
X5482a            3.20          12              1600   150W   4
X5472             3.00          12              1600   120W   4
E5472             3.00          12              1600   80W    4
E5462             2.80          12              1600   80W    4
X5460             3.16          12              1333   120W   4
X5450             3.00          12              1333   120W   4
E5450             3.00          12              1333   80W    4
E5440             2.83          12              1333   80W    4
E5430             2.66          12              1333   80W    4
E5420             2.50          12              1333   80W    4
E5410             2.33          12              1333   80W    4
E5405             2.00          12              1333   80W    4
X5272             3.40          6               1600   80W    2
X5260             3.33          6               1333   80W    2
E5205             1.86          6               1066   65W    2

The new 5400 series chipset supports 128 GB (16 x 8 GB DIMMs) of Fully Buffered DDR2-800 whereas the older 5000 series chipset only supported Fully Buffered DDR2-667.  The new chipset also features a 50% larger 24 MB snoop filter which allows for more efficient cache indexing.  Another key feature that is interesting to the HPC market is the inclusion of PCI-Express generation 2.  PCI-Express generation 1 was throttling the performance of high-end InfiniBand adapters because of a lack of bandwidth and PCI-Express generation 2 solves these problems.

Another chipset that received very little press and attention is the new “San Clemente” chipset which is part of the “Cranberry Lake” platform.  This new chipset uses registered DDR2-533 or DDR2-667 memory and has a peak Front Side Bus of 1333.  It will have a peak memory capacity of six double-rank DIMMs which realistically means you can put 24 GB of RAM in the system using six 4 GB DIMMs.  San Clemente uses the ICH9R south-bridge for storage which is the same storage controller used in the new Intel 3-series desktop chipsets and it has even better storage performance than the already-fast ICH8R.

The new San Clemente chipset also lacks PCI-Express generation 2.  Despite the shortcomings of San Clemente compared to the high-end Seaburg 5400 series chipset, it has some phenomenal performance/watt characteristics when it’s coupled with the 3 GHz 80W TDP chips and less power hungry un-buffered DDR2 memory.  The tradeoff here is that you only get 3/8th the memory capacity of the Seaburg chipset but it may be enough for most applications.  This makes the Cranberry Lake platform ideal for very high density blade solutions where performance/watt and reasonable power ceilings per rack are paramount.
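The memory capacity comparison between the two chipsets can be worked through with exact fractions.  The sketch below uses only the DIMM counts and sizes quoted in this post.  It shows that the 3/8 figure matches the ratio of DIMM slots (six versus sixteen), which is also the capacity ratio when both platforms use the same size DIMMs.  Comparing the 24 GB configuration against Seaburg’s 128 GB maximum with 8 GB DIMMs gives a smaller fraction.

```python
from fractions import Fraction

# DIMM counts and sizes as quoted in the post.
SEABURG_DIMMS = 16
SEABURG_MAX_DIMM_GB = 8
SAN_CLEMENTE_DIMMS = 6
SAN_CLEMENTE_DIMM_GB = 4


def capacity_gb(dimms, gb_per_dimm):
    """Total memory capacity for a given DIMM count and DIMM size."""
    return dimms * gb_per_dimm


# Ratio of DIMM slots, which equals the capacity ratio with same-size DIMMs.
slot_ratio = Fraction(SAN_CLEMENTE_DIMMS, SEABURG_DIMMS)

# Ratio of the quoted configurations: six 4 GB DIMMs versus sixteen 8 GB DIMMs.
quoted_ratio = Fraction(
    capacity_gb(SAN_CLEMENTE_DIMMS, SAN_CLEMENTE_DIMM_GB),
    capacity_gb(SEABURG_DIMMS, SEABURG_MAX_DIMM_GB),
)

if __name__ == "__main__":
    print(f"Seaburg maximum:      {capacity_gb(SEABURG_DIMMS, SEABURG_MAX_DIMM_GB)} GB")
    print(f"San Clemente typical: {capacity_gb(SAN_CLEMENTE_DIMMS, SAN_CLEMENTE_DIMM_GB)} GB")
    print(f"slot ratio:           {slot_ratio}")
    print(f"quoted config ratio:  {quoted_ratio}")
```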

George Ou is Technical Director of ZDNet. See his full profile and disclosure of his industry affiliations.
