Calxeda lives! (Well, at least the Fabric does!)

Last week, in conjunction with AMD's announcement that it is (finally!) shipping the ARM Cortex-A57-based Opteron A1100 series, a little-known company debuted two fabric interconnect solutions for hyperscale data centers. As some of you may recall, AtGames Holdings purchased the intellectual property from Calxeda’s bankers a little over a year ago. AtGames was lined up to be Calxeda’s largest initial customer, and was left in a jam when Calxeda suddenly folded. Its Silver Lining Systems subsidiary announced that it has taken the Calxeda technology and repackaged and reimagined it to produce two new products aimed at the rack-scale fabric market. Given all the excitement this fabric generated, this was perhaps more newsworthy than the expected AMD announcement, so I wanted to chime in with a few thoughts.


The first product is a PCIe card with one Calxeda ECX-2000 ARM SOC, supplying four XAUI ports to interconnect other servers on the fabric, and one 10GbE SFP for uplinks to the top-of-rack switch. It also supports up to 8 GB of optional DRAM to enable the four ARM Cortex A15 processors (normally turned off when the card is acting only as a fabric adapter) to execute jobs such as packet inspection and other offload tasks. The second product, which will be available in mid-2016, is an ASIC for custom server developers. The Fabric Interconnect Chip (FIC) is basically the ECX-2000 without the cache and the A15 cores, providing the 80 GbE fabric switch, the PHYs, and the ARM Cortex A7 to manage the routing tables and optimize traffic flow across the fabric. These products will reduce in-rack networking costs by around 75%, according to Silver Lining Systems.

With this announcement, the promise of the Calxeda Fabric is decoupled from the ARM world and can now be used to interconnect standard x86 servers, or to build custom dense servers with virtually any processor or SOC. Timothy Prickett Morgan (TPM) has already covered the products in an article of his own, but for those who knew and loved Calxeda, I’d like to add a few comments. (Full disclosure: I have recently been consulting with Silver Lining Systems.)


The SLS Newport fabric interconnect adapter (FIA)

There is a lesson here for all you guys trying to provide innovative new technologies, one that is so obvious it can easily be overlooked in the zest for New and Improved! Your new product must provide unique value, but it MUST BE EASILY CONSUMED. Calxeda changed too many things at once: “Here’s a new processor with a new (for servers) instruction set, a new (fledgling) ecosystem, in a new form factor, and with a radically different networking topology and management approach.” “WHOA!” the customers all said. “One change at a time, please!”

Now, had we simply focused on ARM, we might have been more successful, but without a differentiator we wouldn’t have lasted long. Had we just been a fabric company, going up against Mellanox and Cisco and Intel and …, we never would have gotten the funding. BUT, had we produced two dies, the fabric customer could have kept his Intel processors, as SLS has done now, or just used the ARM SOC, as AMD has done, or combined the two. But we didn’t offer them that choice. We realized the problem, but it was too late to fix it, and we ran out of cash when our investors ran out of patience.

So, bottom line? Change as little as possible and still produce value through innovation. OR produce a complete plug-and-play solution, like an iPhone, that fits within the larger infrastructure the customer is already comfortable with.

Three New Year’s Resolutions for ARM Server & SOC Vendors

As we enter 2016, it’s probably a good time to reflect on where the ARM Server movement stands. There are now two vendors with production V8 parts in the market, Cavium and Applied Micro, and more are on the way from AMD, Huawei, Qualcomm, and others. So, the future looks bright, with lots of promises from major vendors. On the negative side, there remain no mainstream ARM-based servers in the market, and zero production use cases we can examine. Even more concerning, there still aren’t any published benchmarks of note. Let’s examine what needs to be done to restore some luster and credibility to this powerful vision of an alternative architectural standard that can compete with Intel Xeon for the server market.

ARM TechCon 2015 and SC’15 recently provided fresh opportunities for the ARM Server community to talk up the latest chips and roadmap announcements. The surprising news was: there was no new news! Sure, AMCC announced intentions to produce a 3rd generation SOC. That’s not news; of course they will do a 16nm part with lots of cores and goodies after they get their 28nm part out. And Gigabyte announced a Cavium-based server design, a year after Cavium supposedly was ready for prime time. But where were the use cases and success stories?

A case in point: During a well-attended HPC session sponsored by ARM Holdings at SC’15, Dr. Olof Barring of CERN lamented that the reality has fallen short of the claims made by the SOC vendors. CERN, one of the world’s leading technology institutions and a champion of power-efficient computing and storage, has been unable to acquire the 64-bit ARM servers that vendors claim are “in production”. Perhaps worse, the 64-bit prototypes he has been able to get his hands on did not demonstrate the performance-per-watt advantages touted by their vendors. “Neither ARM nor POWER 8 has delivered the performance per watt we see today with our Intel servers,” he said.

The harsh reality is that the current (initial) batch of ARM V8 SOCs is still seeking its niches in the market, and it will take time before we see competitive ARM SOCs for general purpose server workloads. So, the perennial battle cry has been “ARM Servers are coming!”, and unfortunately this will remain the case for some time to come. (See Timothy Prickett Morgan’s excellent article on why we are still waiting.) But eventually, ARM will succeed with advances in core microarchitectures, and by narrowing the gap afforded by Intel’s Fabulous Fabs. With this in mind, here’s my list of suggested New Year’s resolutions for the industry’s players to consider:

New Year’s Resolution #1: Be patient. The effort to bring forward a competitive SOC and software ecosystem will take years to materialize. We will need faster cores, advanced process nodes like 14nm FinFET, and lots of work on the optimized software stack, such as the efforts begun by Red Hat, ARM Holdings, and Canonical.

New Year’s Resolution #2: Tell the truth. ARM Server SOC vendors have been fairly undisciplined in communicating the facts about schedule, performance, and power consumption, resulting in the perception of “late”, “slow” chips that are not as power efficient as their Intel competition when it comes to number crunching. Vendors should set realistic expectations on all three counts and be very explicit about their SOC’s applicability for specific workloads.

New Year’s Resolution #3: Be transparent. A corollary of #2. Your customers are very smart; they can handle the truth. And they tire of hearing claims of superior performance per watt without any reproducible benchmarks or third-party measurements. So, give your gear to folks like AnandTech and let them measure the efficiency with real-world workloads. Yes, synthetic benchmarks will always give Intel an advantage, and they aren’t relevant to the sort of workloads you are targeting. So tell us something that *is* relevant to your markets.

Have I grown skeptical of ARM’s potential in servers? No, I remain enthusiastic and optimistic about the future. Lots of hard work remains. Be vigilant, take the high road, and trust that your customers are smart enough to tell the difference between facts and BS.

Note: the opinions expressed in this blog are solely the views of the author.