Dell Enters the ARM Race for Servers

Dell announced today that they are working with TACC (Texas Advanced Computing Center) to provide access to ARM-based servers for developers and researchers. Here’s one of many articles: http://www.v3.co.uk/v3-uk/news/2180570/dell-targets-datacentres-arm-copper-servers . This is an important announcement, since Dell is a big player in the markets best suited for ARM, such as Internet web properties, Big Data, Cloud Service Providers, and HPC outfits. The announcement further validates the market demand for ARM-based servers and helps accelerate the growth of the required ecosystem.

The observant reader will notice that Dell is using the Marvell Armada XP SoC because, as Dell pointed out, “the Marvell parts are already available in sufficient quantities”. But Dell positioned themselves as “agnostic” and confirmed in interviews that they are also working with Calxeda.

As the ARM server market continues to take shape, the one constant you can expect is choice. That’s the beauty of the ARM business model! Congratulations, Dell! Welcome to the party!

Open Source Software Packages for Initial Calxeda Shipments

We are often asked what open-source software packages are available for initial shipments of Calxeda-based servers.

Here’s the current list (changing frequently).  Let us know what else you need!

Ubuntu Server 12.04 LTS and Fedora v17+

Compilers/Languages

  • GCC/gFortran 4.6.2
  • PHP 5.3.8
  • Perl 5.14.2
  • Python 2.7.2, 3.2.2
  • Ruby 1.8.7, 1.9.3
  • Erlang r14

Debuggers/Profilers

  • GDB 7.4
  • GProf 2.13
  • OProfile 0.9.6

Java

  • Oracle JVM SEv7u4
  • OpenJDK 6b24

Applications

  • Apache 2.2.21
  • Tomcat 6.0.32
  • MySQL 5.5.17
  • PostgreSQL 9.1
  • Apache Cassandra 1.0.7+
  • Apache Hadoop 1.0.0+
  • Memcached v1.4.13+

HPC Related Packages

MPI

  • MPICH 1.2.7
  • OpenMPI 1.4.3
  • MPICH2 1.4.1
  • Open-MX 3.5

Checkpoint

  • DMTCP 1.2.1
  • Condor 7.2.4

Libraries

  • BLAS 1.2
  • FFTW 2.1.5
  • ScaLAPACK 1.8.0

Monitoring

  • Ganglia 3.1.7

 

The Little (ARM) Server That Could

Two weeks ago, Calxeda publicly demonstrated Ubuntu 12.04 on the EnergyCore SoC, a monumental occasion for the ARM server industry.  The progress that’s been made by Calxeda and our partners over the last 12 months has truly been remarkable.  The journey we’ve taken and the opportunity afforded us reminds me of a famous childhood story, “The Little Engine That Could”; a story that teaches children about hard work and believing in ourselves.

The Little Engine That Could by Watty Piper

(Spoiler alert: Essentially, there’s a stranded train that needs help getting over a high mountain. Some of the larger, more established, engines are asked to pull the train, but for various reasons they refuse. So they ask the small engine, who agrees to try. The engine successfully pulls the train over the mountain while repeating its motto: “I-think-I-can”.)

There have been naysayers who have, from the very beginning, doubted not only Calxeda’s ability, but the ability of an entire ecosystem to recognize and respond to an industry desperate for change.  And that’s exactly why the world’s first Ubuntu 12.04 demo on an ARM server two weeks ago was so exciting!  Together with our partners, we demonstrated the following on a Calxeda reference server:

  1. Fully functional web server powering a local copy of calxeda.com
  2. Cloud Infrastructure-as-a-Service (IaaS) platform via OpenStack
  3. Support for Canonical’s Juju and MaaS for system configuration and provisioning

Some people have recently asked me, “So, what’s the big deal?” Well, I want to take a moment to provide some color commentary about these demos and, more importantly, what these demos really represent.

What is an SoC? Hint: the “S” stands for Server.

The acronym “SoC” generally refers to “System on a Chip”. But with SoCs entering the server space, it is also taking on a new meaning: “Server on a Chip”. An SoC is a large scale integration of processor cores, memory controllers, on-chip and off-chip memories, peripheral controllers, accelerators, and custom IP (intellectual property) for specific applications and uses. As Moore’s law continues, chip process geometries shrink, allowing more transistors to reside on the same area of silicon. Traditionally, server processors have used this new real estate to add more cores. But there are better alternatives than just adding more cores for certain applications.

Increasing integration in an SoC brings a number of benefits including:

  • Higher performance – significantly faster and wider internal busses compared to those found in a multi-chip or multi-board solution.
  • Lower power – wider range of power optimization techniques can be employed in SoCs including power gating, changing bus speeds depending upon utilization, dynamic voltage and frequency scaling of processor cores and peripherals, multiple power domains, and a number of others. Additionally, having peripherals on chip avoids power hungry PHYs (analog drivers that need to drive signals between chips and boards).
  • Higher density – fewer components to buy, consume power, and fail.
  • Deeper integration – peripheral controllers and fabric interconnect technologies integrated on-chip offer advantages that cannot normally be achieved when traffic has to go through standard bridges like PCIe.
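To put a rough number on the dynamic voltage and frequency scaling mentioned in the list above, here is a minimal sketch using the textbook CMOS switching-power approximation, P = a · C · V² · f. The function names and the 80% operating point are illustrative assumptions, not measurements from any Calxeda part, and the model ignores leakage power.

```python
def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """Textbook CMOS switching-power approximation: P = a * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


def relative_power(voltage_scale, frequency_scale):
    """Dynamic power relative to baseline when V and f are each scaled."""
    # Activity factor and capacitance cancel out of the ratio.
    return voltage_scale ** 2 * frequency_scale


# Hypothetical operating point: voltage and clock both reduced to 80%.
ratio = relative_power(0.8, 0.8)
print(f"Dynamic power at 80% V and 80% f: {ratio:.1%} of baseline")
```

Because voltage enters squared, lowering voltage and frequency together cuts dynamic power faster than it cuts clock speed, which is why this technique shows up in the list.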

Let’s stop and consider the components we typically will find in a standard rack-optimized volume server:

  • One or two processor chips, often with integrated memory controllers.
  • One or two chips for processor chipsets providing a range of functions like Southbridge peripherals and PCIe.
  • A PCIe connected Ethernet NIC, either chip or PCIe board. In today’s volume servers, this is typically one or two 1 Gb Ethernet interfaces.
  • A PCIe connected SATA controller, either chip or PCIe board.
  • Controller chip for an SD card and/or USB.
  • An extra cost, optional BMC (baseboard management controller) providing out of band system management control.

So, now with the availability of a purpose-built ARM® server SoC, how does this change? Everything in the laundry list above gets integrated onto a single, low power die. For example, let’s take a look at the Calxeda EnergyCore ECX-1000 series of SoCs. In each chip, we find:

  • A quad-core Cortex A9 CPU, configured for server workloads.
  • The largest L2 cache that you’ll find on an ARM server: 4 MB with ECC.
  • A server class memory subsystem including a wide, high-performance 72-bit DDR3/3L memory controller, also including ECC.
  • Integrated peripheral controllers that have direct DMA interfaces to the internal SoC busses without the PCIe overhead. Standard server peripheral controllers like multiple lanes of SATA, multiple Ethernet controllers (both 1 Gb and 10 Gb), and even an SD/eMMC controller for local boot or scratchpad storage are all integrated on-chip.
  • If your server needs to connect to devices that are not integrated, there are four dual-mode PCIe controllers, supporting both root-complex and target modes, in both x4 and x8 configurations.
  • Instead of an optional (and expensive) BMC, management is built onto every chip, providing a sophisticated server management system that provides both in-band and out-of-band IPMI/DCMI system management interfaces along with dynamic power and fabric management.
  • A deeply integrated, power and performance-optimized fabric interconnect, which we’ll talk about in a future blog entry.
  • And all of this is designed with performance-, power-, and cost-optimized servers in mind, delivering industry-leading performance/Watt and performance/Watt/$.
Calxeda EnergyCore ECX-1000 Block Diagram

 

With all the typical server components integrated onto a single chip, you can build a server by “just adding power and DRAM”. And even that is made easy for our customers with a card-level reference design of four EnergyCore SoCs, power regulators, DRAM, and fabric interconnect.

For the last several years, SoCs have been used in embedded systems and mobile devices for the same reasons and benefits discussed above. The server industry is now applying those same lessons to its own domain. No matter what the design looks like, a better integrated and power-optimized Server-on-a-Chip is needed for the scale-out, cluster demands of our Internet generation.

A note about fruit

When comparing fruit, everyone knows not to compare apples to, say, oranges or, god forbid, kumquats. The same applies to chips. See this nice article, then come back and read on…

http://www.marketwatch.com/story/dell-to-unveil-new-lower-power-intel-run-servers-2012-05-08

Nice job, Dell. Ditto Intel! Now, you might think, “Oh wow! A 20 watt Intel server! ARM’s lead certainly didn’t last long; Calxeda is toast!” A sub-20 watt Xeon is indeed an accomplishment; Intel is a great company and knows what they are doing. But be careful when comparing our 3.8 (OK, call it 4) watt ECX-1000 to a Xeon. On the surface, we consume 1/5th the power. Not bad! But the story runs deeper than that. Let’s dissect the fruit and see what’s inside.

Xeon is not an SoC (more on that in another blog post). It is a multi-core processor, like the Cortex A9 from ARM. It does have some integrated I/O (PCIe 3.0, to be precise). But it does not have Ethernet, much less five 10 Gigabit Ethernet ports. It does not have SATA controllers. It does not have an integrated BMC for processor management, much less fabric management and power optimization. All of these must be added as separate components, raising the system BOM cost and power envelope, to offer the functionality a Calxeda ECX-1000 already includes. Xeon does have more performance per thread: probably 3-5X, in fact, depending on the workload. But remember that ARM processors for servers are NOT about performance. If you need performance, buy Intel, or AMD, or IBM Power. But it doesn’t matter how fast your thread or core can run if you are spending 90% of your time waiting for I/O. And that is exactly the problem people have with traditional architectures today when dealing with data-intensive computing such as Hadoop.
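To see why per-thread speed matters so little for an I/O-bound job, here is a back-of-the-envelope sketch in the spirit of Amdahl’s law. It assumes the 90% I/O wait is measured on the slower core and that a faster core shortens only the compute portion; the function name is made up for illustration.

```python
def overall_speedup(io_wait_fraction, core_speedup):
    """Whole-job speedup when only the compute share of the time gets faster."""
    compute_fraction = 1.0 - io_wait_fraction
    # I/O wait is unchanged; compute time shrinks by the core speedup.
    return 1.0 / (io_wait_fraction + compute_fraction / core_speedup)


# 90% of time waiting on I/O, with cores 3x and 5x faster per thread.
for s in (3, 5):
    print(f"{s}x faster core -> {overall_speedup(0.9, s):.2f}x faster job")
```

Under these assumptions, a core that is 3 to 5 times faster finishes the whole job less than 10% sooner.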

What really matters is the total power and cost of a CLUSTER for a particular workload, not of a processor or even an SoC. Each Calxeda server node in a cluster consumes only 5 watts at 100% load, complete with DRAM. At idle, it consumes only 0.5 watts. (Oh, yeah, don’t forget about memory, which can consume as much as 1 watt per gigabyte in traditional servers!)
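As a rough illustration of cluster-level accounting, the sketch below adds up node power using only the per-node figures quoted above (5 watts at full load, 0.5 watts at idle). The 48-node count is the capacity of the 2U prototype chassis described elsewhere on this blog; the function name and the busy/idle split are assumptions, and disks, fans, and power-supply losses are deliberately left out.

```python
NODE_ACTIVE_W = 5.0  # per node at 100% load, DRAM included (figure from the text)
NODE_IDLE_W = 0.5    # per node at idle (figure from the text)


def cluster_node_power_w(nodes, busy_fraction):
    """Total node power when busy_fraction of nodes are at 100% and the rest idle."""
    if not 0.0 <= busy_fraction <= 1.0:
        raise ValueError("busy_fraction must be between 0 and 1")
    busy_nodes = nodes * busy_fraction
    idle_nodes = nodes - busy_nodes
    return busy_nodes * NODE_ACTIVE_W + idle_nodes * NODE_IDLE_W


# A 48-node chassis at full load, half load, and idle.
for busy in (1.0, 0.5, 0.0):
    print(f"{busy:.0%} busy: {cluster_node_power_w(48, busy):.0f} W")
```

The point is less the exact numbers than the unit of comparison: sum everything a node needs, multiply by the node count, and compare clusters rather than chips.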

So, always be sure to check your fruit carefully!

ARM Arrives – Calxeda Shows Real Hardware

Richard Fichera of Forrester Research was one of the first to see the potential of ARM in the datacenter. He takes note of today’s milestone:

http://blogs.forrester.com/richard_fichera/12-05-07-arm_arrives_calxeda_shows_real_hardware_running_linux

Calxeda demonstrates Ubuntu 12.04 LTS on EnergyCore SoC

This week, Calxeda is showing a live Calxeda cluster running Ubuntu 12.04 LTS on real EnergyCore hardware at the Ubuntu Developer and Cloud Summit events in Oakland, CA. This is not an FPGA demo. This is the real deal on real silicon: quad-core, with 4 MB of cache, a secure management engine, and Calxeda’s fabric, all up and running.

On stage at UDS

Larry Wikelius, Co-founder of Calxeda, on stage with Mark Shuttleworth at UDS.

Calxeda’s “Greenbox” (get it?) prototype supports up to 48 quad-core SoCs in a 2U package

Ubuntu 12.04, with support from Canonical, is the first Linux distribution with full support for ARM as a first-tier server architecture. Incorporating OpenStack’s cloud management infrastructure, Ubuntu 12.04 is designed to support the world’s largest cloud environments, where Ubuntu enjoys commanding market share today.

After months of discussion, debate, claims, and counterclaims, the industry can now begin a fact-based dialog about Calxeda-based servers. What applications are appropriate? Are they fast enough? How much can they really save large internet and IT shops? Do they really consume only 5 watts each? In other words, this new category of technology is moving beyond PowerPoints and on to proof points. OK, we will still pepper the market with pretty presentations, but at least they will contain real benchmarks and measurements made on real systems. We will begin communicating benchmark results on calxeda.com soon.

So, back to Oakland… Running Ubuntu 12.04, we are demonstrating a standard LAMP stack (running Calxeda’s website) along with other popular web frameworks such as node.js and Ruby on Rails, provisioning of OpenStack Nova compute instances, and even Canonical’s Metal-as-a-Service bare-metal provisioning. The cluster we are running is a Calxeda EnergyCard prototype in a 2U chassis that supports up to 48 quad-core nodes at under 300 watts, with up to 24 SATA drives. For more information about UDS, please see http://uds.ubuntu.com/. Remote participation for UDS is available at http://uds.ubuntu.com/community/remote-participation/.

While exciting to see, this demo really shows just how easy it is to move modern software over to Calxeda and Ubuntu. Literally, it all just worked. The code came up without any modifications. Just load and go.

The Linux community will see immediate benefits from such a server for building Linux kernels and distributions. A complete build of the Ubuntu 12.04 kernel took less than an hour to compile on a single node, 1/4 the time of current ARM build platforms. With a larger Calxeda cluster, a full build of the entire distro will take hours, instead of weeks.

Now that Calxeda EnergyCore has been seen in the wild, you can expect more sightings at a variety of industry events, and end-user shipments will begin over the next 4-8 weeks. Volume shipments are expected to begin early this fall from HP and other system vendors. Be sure to check our website frequently to get updates.

Who said that hardware is boring? Let the fun, and games, begin!

Hello World!

Welcome to ARM Servers, Now!, a blog dedicated to bringing industry perspectives on ARM-based servers, data centers, and the technologies and partners bridging those two worlds. Our mission is to provide weekly news about the transformations data centers are undergoing to become more energy efficient while achieving the computational throughput needed by today’s scale-out, hyperscale, and cloud computing infrastructures.

We encourage you to join us in the conversation as we look to change the world together!  If you are interested in contributing to this blog, please feel free to leave us a comment.
