As the ARM server market began to emerge in press and PowerPoint, it was not hard to separate the hype from reality: there was a lot of hype. Spread by well-meaning advocates trying to change the world and give Intel a run for its money, these myths created unrealistic expectations about whether ARM chips are worthy of server applications, when they will ship, and how hard they will be to use. I applaud early leaders such as APM and AMD for their efforts on 64-bit products. While they have tried to balance their excitement against the uncertainty of semiconductor development schedules, there are nonetheless a few myths that need clearing up. Here are six common ones:
Calxeda has announced its second-generation SoC, the ARM® Cortex™ A15-based EnergyCore™ ECX-2000. It is the industry’s first ARM-based SoC enabled for full OpenStack clouds and Xen and KVM virtualization, and it delivers twice the performance of first-generation ARM-based server SoCs. Calxeda will demonstrate the new platform running Ceph object storage and OpenStack at this week’s ARM TechCon conference in Santa Clara, October 29-31. Notably, HP has selected the ECX-2000 for an upcoming Moonshot server in early 2014. Calxeda has also added a second, 64-bit SoC to its roadmap that is pin-compatible with the ECX-2000, accelerating the availability of production 64-bit Calxeda-based systems in 2014 and protecting customers’ investments.
While this is big news, there is a far more important story to be told. The new ECX-2000 is just the next step on the journey to a far more efficient datacenter. This journey will fundamentally reshape the datacenter infrastructure into a fleet of compute, storage, networking, and memory resources; the so-called Software-defined Data Center.
Written by Shawn Kaplan, General Manager – Financial Services, TELX
Advances in multi-core computing have enabled far greater compute densities, such that nearly all datacenter racks run out of available power far sooner than physical space. Traditional High Performance Computing (HPC) x86 clusters can consume upwards of 400W per rack unit (U), which means that a typical datacenter rack with a 5KW-8KW circuit can be maxed out with as little as one quarter to one half of the space filled. Many of today’s forward-thinking IT leaders are asking, “Why can’t I have both extremely dense computing and better power efficiency?”
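The arithmetic behind that claim is simple. A minimal sketch, assuming a standard 42U rack and the 400W-per-U and circuit figures from the paragraph above:

```python
# Back-of-the-envelope rack density math using the figures above.
# Assumptions: a standard 42U rack and 400 W per 1U server.
RACK_UNITS = 42
WATTS_PER_U = 400

for circuit_watts in (5000, 8000):
    # How many 400 W rack units the circuit can actually power
    powered_units = circuit_watts // WATTS_PER_U
    fraction_used = powered_units / RACK_UNITS
    print(f"{circuit_watts / 1000:.0f} kW circuit: {powered_units}U powered "
          f"({fraction_used:.0%} of the rack)")
```

A 5KW circuit powers only about 12 of the 42 units; an 8KW circuit powers about 20, which is where the "one quarter to one half" figure comes from.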
IEEE held its annual fest for uber-techies, SuperComputing ’12, this week in Salt Lake City. With over 8,000 attendees flocking to the snowy site in spite of the economy and the impending fiscal cliff, this event has become a mecca for anyone seeking the next great technology in computing hardware for serious work. In the old days, it was all about (Tera)Flops and Fortran. These days it is about Big Data, hardware acceleration, interconnect fabrics, storage, and green computing. Wandering around the massive exhibit hall, one could see name badges from companies like eBay, Amazon, Peer One Hosting, and Dreamworks, right alongside the traditional attendees from leading universities, national labs, and the Departments of Defense and Energy.
So, what’s a little core like ARM doing in a place like this? It’s all about the data. “Data Intensive Computing” in HPC is pronounced “Big Data” in the enterprise. And the two communities have another thing in common: both are seeking more energy-efficient solutions to large computational challenges. So naturally, they are turning to ARM with great hopes for the future.
Remember how smoothly Apple transitioned from PowerPC chips to x86 back in the mid-2000s? Customers hardly noticed that all their software “just worked” on a completely different ISA, thanks to some cool software built by Transitive, a small UK-based company since gobbled up by IBM. Well, emulation doesn’t solve all the world’s problems, and critical applications will of course need to go native for maximum performance. But this approach can be very helpful with the CAO, or Computer Aided Other: the ancillary but important applications, tools, and utilities that are so pervasive in a datacenter.
Below is an excerpt from the EE Times article, ARM Gets Weapon in Server Battle Vs. Intel.
Russian engineers are developing software to run x86 programs on ARM-based servers. If successful, the software could help lower one of the biggest barriers ARM SoC makers face getting their chips adopted as alternatives to Intel x86 processors that dominate today’s server market.
Elbrus Technologies has developed emulation software that delivers 40 percent of current x86 performance. The company believes it could reach 80 percent native x86 performance or greater by the end of 2014. Analysts and ARM execs described the code as a significant, but limited option.
A growing list of companies, including Applied Micro, Calxeda, Cavium, Marvell, Nvidia and Samsung, aims to replace Intel CPUs with ARM SoCs that pack more functions and consume less power. One of their biggest hurdles is that their chips do not support the wealth of server software that runs on the x86.
The Elbrus emulation code could help lower that barrier. The team will present a paper on its work at the ARM TechCon in Santa Clara, Calif., Oct. 30-Nov. 1.
The team’s software uses 1 Mbyte of memory. “What is more exciting is the fact that the memory footprint will have weak dependence on the number of applications that are being run in emulation mode,” Anatoly Konukhov, a member of the Elbrus team, said in an e-mail exchange.
The team has developed a binary translator that acts as an emulator, and plans to create an optimization process for it.
“Currently, we are creating a binary translator which allows us to run applications,” Konukhov said. “Implementation of an optimization process will start in parallel later this year–we’re expecting both parts be ready in the end of 2014.”
Work on the software started in 2010. Last summer, Elbrus got $1.3 million in funding from the Russian investment fund Skolkovo and MCST, a veteran Russian processor and software developer. MCST is also providing developers for the [Elbrus] project.

Emulation is typically used when the new architecture has higher performance than the old one, which is not the case, at least today, when moving from the x86 to ARM. “By the time this software is out in 2014 you could see chips using ARM’s V8, 64-bit architecture,” Krewell noted. “That said, you will lose some of the power efficiency of ARM when doing emulation,” Krewell said. “Once you lose 20 or more percent of efficiency, you put ARM on par with an x86,” he added.

Emulation “isn’t the ideal approach for all situations,” said Ian Ferguson, director for server systems and ecosystem at ARM. “For example, I expect native apps to be the main solution for Web 2.0 companies that write their own code in high level languages, but in some areas of enterprise servers and embedded computing emulation might be interesting,” he said.
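The article does not describe the internals of the Elbrus translator, but the basic idea of binary translation is easy to sketch: decode each instruction of the source ISA and emit a semantically equivalent sequence for the target ISA, falling back to interpretation for anything unhandled. Here is a deliberately toy, hypothetical illustration; a production translator works on actual machine code and must track registers, flags, memory models, and caches of translated blocks:

```python
# Toy illustration of the binary-translation idea: map simplified
# x86-style instructions onto ARM-style equivalents. This is a sketch
# of the concept only, not how any real translator (Elbrus's included)
# is implemented.
X86_TO_ARM = {
    "mov": "mov {dst}, {src}",
    "add": "add {dst}, {dst}, {src}",   # ARM add takes three operands
    "sub": "sub {dst}, {dst}, {src}",
}

def translate(block):
    """Translate a list of ('op', dst, src) tuples into ARM-style text."""
    out = []
    for op, dst, src in block:
        # A real translator would fall back to an interpreter for
        # instructions it cannot translate.
        template = X86_TO_ARM[op]
        out.append(template.format(dst=dst, src=src))
    return out

print(translate([("mov", "r0", "#5"), ("add", "r0", "#3")]))
```

The interesting engineering, and the reason Elbrus plans a separate optimization pass, is in recognizing hot translated blocks and re-optimizing them, which is where the gap between 40 percent and 80 percent of native performance would be closed.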
Back in June, Calxeda published web-serving benchmarks that claimed a significant advantage in performance per watt over x86-based servers. Using ApacheBench, a single 5.26-watt Calxeda EnergyCore server delivered 5500 transactions per second, compared to a 102-watt (TDP) Intel E3-1240 that saturated the network at 6950 TPS. About two months later, Intel spoke with Timothy Prickett Morgan at The Register to provide its response.
You have to hand it to Intel; they make really fast processors, which are appropriate when maximum compute performance is needed. But Intel’s argument misses the point, and the very reason why Extremely Efficient Servers are a promising trend: by right-sizing the compute, memory, and networking infrastructure to meet real workload requirements, one can save a great deal of money and power. Intel’s response is classic PC-Server-era thinking: use a faster CPU, and then feed it like a force-fed goose being prepped for foie gras. In this case, they added a 10G Ethernet port to try to close the gap. But if 5,000 transactions per second is all your website needs, or you use load balancing to handle peak loads above normal usage, Calxeda is dramatically more efficient. That is the point.
It is a bit surprising that Intel went to these lengths when Intel’s own math shows Calxeda maintaining a 4-5X performance-per-watt advantage versus the solution most websites would use. Apparently not satisfied, Intel then upped the ante and added an expensive 10Gb network infrastructure to keep its uber-fast processor busy. Even with this configuration, Calxeda is still some 30% more efficient than the significantly more expensive* 10Gb Ivy Bridge solution. But small-to-medium websites rarely use or need a 10Gb Ethernet port; a 1Gb interface is usually sufficient for typical demand. Moreover, Intel’s proposed alternative would require two 10Gb top-of-rack switch (TORS) ports in addition to the three NICs (two for data, one for management). Those TORS ports alone could add 10-15 watts per server to the 10Gb solution that were not included in Intel’s math. But hey, it won the benchmark (well, almost)!
Calxeda is focused on providing energy-efficient solutions for real-world problems, and we believe that bigger and faster is not always better. Leaner and cleaner can be less expensive and far less power hungry, lowering costs for real-world workloads, which can be highly variable. Which is more representative of your real-world environment? You be the judge.
* Based on comparing the servers without disks to isolate server power, and adding 1 watt to each 5.26-watt Calxeda node to estimate wall power, assuming a modest 24 nodes in a chassis sharing the power supply and fans. Note that each Intel server equipped as Intel suggests would require a PCI extension with 10Gb NICs, plus switch ports: two for data and one for management. These are costly additions ($700 per two ports, plus the required 10Gb TORS ports) to the Ivy Bridge server, and of course they consume even more power. We are still optimizing our platform, and Calxeda will publish a slew of benchmarks and wall-power measurements in the coming weeks.
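For readers who want to check the arithmetic, here is the per-node throughput-per-watt comparison using only the figures quoted in this post: 5500 TPS at an estimated 6.26W of wall power per Calxeda node, versus 6950 TPS at the E3-1240’s 102W TDP. TDP is not measured wall power, and Intel’s own 4-5X figure was normalized differently, so treat the per-node ratio as illustrative:

```python
# Per-node throughput-per-watt comparison from the figures quoted in
# this post. Caveat: Intel's 102 W is TDP, not measured wall power,
# so the resulting ratio is only illustrative.
calxeda_tps, calxeda_watts = 5500, 5.26 + 1.0   # +1 W wall-power estimate per node
intel_tps, intel_watts = 6950, 102.0

calxeda_eff = calxeda_tps / calxeda_watts       # ~879 TPS/W
intel_eff = intel_tps / intel_watts             # ~68 TPS/W
print(f"Calxeda: {calxeda_eff:.0f} TPS/W, Intel: {intel_eff:.0f} TPS/W, "
      f"ratio: {calxeda_eff / intel_eff:.1f}x")
```

Even with generous rounding, the per-node gap is large; it is the shared-infrastructure and network-port power discussed above that brings the system-level comparison closer together.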
It’s the middle of June, which means we’re smack in the middle of tradeshow and conference season for the IT industry. We were at Computex in Taipei two weeks ago, and this week we’re participating in International Supercomputing in Hamburg, and GigaOM’s Structure conference in San Francisco. In fact, our CEO, Barry Evans, is on a panel to discuss fabric technologies and their role in the evolution of datacenters. Should be a good one!
The hectic season hasn’t stopped us from moving forward with what everyone is really waiting for: benchmarks! I’m happy to share some preliminary results on both performance and power consumption for those of you looking for more efficient web servers.
Dell announced today that it is working with TACC (Texas Advanced Computing Center) to provide access to ARM-based servers for developers and researchers. Here’s one of many articles: http://www.v3.co.uk/v3-uk/news/2180570/dell-targets-datacentres-arm-copper-servers. This is an important announcement, since Dell is a big player in the markets best suited for ARM, such as Internet web properties, Big Data, Cloud Service Providers, and HPC outfits. As such, it further validates market demand for ARM-based servers and helps accelerate the growth of the required ecosystem.
The observant reader will notice that Dell is using the Marvell Armada XP SoC because, as Dell pointed out, “the Marvell parts are already available in sufficient quantities”. But Dell positioned itself as “agnostic”, and confirmed in interviews that it is also working with Calxeda.
As the ARM server market continues to take shape, the one constant you can expect is choice. That’s the beauty of the ARM business model! Congratulations, Dell! Welcome to the party!