KiloCore is the World’s First 1,000 Core Processor

A team of researchers from the University of California, Davis has designed a first-of-its-kind 1,000-core processor, which they call the KiloCore.

Bevan Baas, the professor of electrical and computer engineering who led the team, said:

“To the best of our knowledge, it is the world’s first 1,000-processor chip and it is the highest clock-rate processor ever designed in a university.”

There have been earlier attempts at making processors with a large number of cores, but none of them went past 300 cores, according to an analysis by Professor Baas’ team. The majority of those were made for research purposes, and very few ever made it into the consumer market.

Features of the KiloCore

The chip was fabricated by IBM using 32nm CMOS technology. The professor claims that it is also the most energy-efficient many-core processor, saying that it can execute 115 billion instructions per second while dissipating just 0.7 watts.
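To put the power claim in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the team's reported figures (115 billion instructions per second at 0.7 watts); the function names are made up for illustration, and the result is an average that says nothing about individual instruction types.

```python
# Figures reported by the KiloCore team.
INSTRUCTIONS_PER_SECOND = 115e9  # 115 billion instructions per second
POWER_WATTS = 0.7                # power dissipated at that rate


def energy_per_instruction_pj(power_watts: float, instr_per_sec: float) -> float:
    """Average energy per instruction in picojoules (1 watt = 1 joule per second)."""
    return power_watts / instr_per_sec * 1e12


def instructions_per_joule(power_watts: float, instr_per_sec: float) -> float:
    """How many instructions one joule of energy pays for, on average."""
    return instr_per_sec / power_watts


if __name__ == "__main__":
    pj = energy_per_instruction_pj(POWER_WATTS, INSTRUCTIONS_PER_SECOND)
    per_joule = instructions_per_joule(POWER_WATTS, INSTRUCTIONS_PER_SECOND)
    print(f"about {pj:.2f} pJ per instruction")
    print(f"about {per_joule:.3g} instructions per joule")
```

That works out to roughly 6 picojoules per instruction on average.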

The way the processor handles its tasks is also more efficient than in regular processors, because it divides each application into smaller applications. Each of the smaller applications is given to a single processor core to handle, which is a more flexible approach. According to the team, it can execute instructions about 100 times more efficiently than a regular laptop processor.
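The idea of splitting one application into small pieces that each run on their own core can be loosely illustrated in Python. This is only a sketch under stated assumptions: threads stand in for cores, queues stand in for the links between them, and the names `stage` and `run_pipeline` are hypothetical. It is not the KiloCore programming model.

```python
import queue
import threading

_STOP = object()  # sentinel that tells each stage there is no more work


def stage(func, inbox, outbox):
    """One small program: read an item, process it, pass it to the next stage."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # forward the shutdown signal downstream
            return
        outbox.put(func(item))


def run_pipeline(funcs, items):
    """Run each function as its own worker and stream items through them in order."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(funcs)
    ]
    for worker in workers:
        worker.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_STOP)

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    # Three tiny stages, each handled by a separate worker.
    steps = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
    print(run_pipeline(steps, range(5)))
```

Each stage only talks to its neighbours, so no shared pool of data sits between them.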

Possible Use in the Future

The KiloCore is unlikely to enter mass production any time soon, as it was made using a relatively old 32nm manufacturing process, whereas most processors nowadays are made using 14nm technology.

We could possibly see it make its way into mobile devices, where several small applications run at the same time and low power use is a major plus. This could lead to longer-lasting mobile gadgets.

Via Engadget


      • That is because you are reading the story on propakistani which “summarised” the story on Engadget. Go to the source, bro.

        1.782 GHz

        Each processor issues one in-order instruction per cycle into its 7-stage pipeline from either its 128 x 40-bit local instruction memory or an independent memory module. None of the 72 supported instruction types are algorithm-specific. Processor data memories are implemented as two 128 x 16-bit banks to sustain a throughput of one instruction per cycle for common instructions.

        Communication on-chip is accomplished by a high-throughput circuit-switched network and a complementary very-small-area packet-switched network (Fig. 3). The source-synchronous circuit-switched network supports communication between adjacent and distant processors, as resources allow, with each link supporting a maximum rate of 28.5 Gbps.

        Except for the 64 KB SRAMs inside the independent memory modules, all memories are built from clock-gated flip-flops with synthesized interfacing logic, which greatly simplifies the physical design and likely lowers the minimum operating voltage. Each processor contains 575,000 transistors and occupies 239 µm by 232 µm.

        KiloCore’s 1000 MIMD processors are arrayed in 32 columns and 31 rows, with 8 processors and 768 KB inside 12 independent memories in a 32nd row.

        KiloCore processors do not contain explicit caches and instead store data and instructions inside i) local memory, ii) an arbitrary number of nearby processors, iii) on-chip independent memory modules, or iv) off-chip memory.
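The numbers in the excerpt above can be cross-checked with a little arithmetic. The Python sketch below only multiplies out the quoted figures; the constant and function names are made up for illustration, and the byte sizes assume no extra parity or control bits.

```python
# Figures taken from the quoted excerpt.
COLUMNS = 32
FULL_ROWS = 31
EXTRA_ROW_PROCESSORS = 8       # processors in the 32nd row
MEMORY_MODULES = 12            # independent memory modules
SRAM_KB_PER_MODULE = 64
INSTR_WORDS, INSTR_WORD_BITS = 128, 40
DATA_BANKS, DATA_WORDS, DATA_WORD_BITS = 2, 128, 16
WIDTH_UM, HEIGHT_UM = 239, 232


def processor_count() -> int:
    """Processors in the full rows plus the partial 32nd row."""
    return COLUMNS * FULL_ROWS + EXTRA_ROW_PROCESSORS


def shared_sram_kb() -> int:
    """Total SRAM across the independent memory modules, in KB."""
    return MEMORY_MODULES * SRAM_KB_PER_MODULE


def instruction_memory_bytes() -> int:
    """Local instruction memory per processor, in bytes."""
    return INSTR_WORDS * INSTR_WORD_BITS // 8


def data_memory_bytes() -> int:
    """Local data memory per processor (both banks), in bytes."""
    return DATA_BANKS * DATA_WORDS * DATA_WORD_BITS // 8


def processor_area_mm2() -> float:
    """Area of one processor in square millimetres."""
    return WIDTH_UM * HEIGHT_UM / 1e6


if __name__ == "__main__":
    print(processor_count(), "processors")
    print(shared_sram_kb(), "KB of shared SRAM")
    print(instruction_memory_bytes(), "bytes of instruction memory per processor")
    print(data_memory_bytes(), "bytes of data memory per processor")
    print(f"{processor_area_mm2():.4f} mm^2 per processor")
```

So 32 x 31 plus 8 does give 1,000 processors, 12 modules of 64 KB give the quoted 768 KB, and each processor has only a few hundred bytes of local memory.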


    • You think a major university working in collaboration with IBM will make such claims without evidence? LOL AT YOU THINKING BRO

      • No, my dear. I just asked for more details to understand the architecture, because this article looks incomplete without them.

  • Lol, this is hilarious. Someone doesn’t know their microprocessor theory. You cannot just “divide an application into smaller applications”. Unless a program has 1000 actual processes for individual thread operations, all those cores offer zero advantage in real life. Cores, just like gigahertz, are not everything. I’m willing to bet any recent dual core processor from Intel will run circles around this.

    • Nothing ARM designs or produces with their collaborating companies can run circles around Intel. And yet, ARM chips are eating away at Intel’s lower end market.

      Also, it is highly dependent on the application. Have you looked inside your GPU lately? This design is not very different from that, only a LOT MORE POWER EFFICIENT.

      • Two entirely different use cases. Intel is king when it comes to performance, whereas ARM focuses on efficient design at the cost of performance, using the big.LITTLE architecture. But what does all that have to do with this?

        And no, this is nothing like a GPU. It’s designed to do arithmetic computations, not polygons. Also their claim of being the first 1000 core processor would be wrong in that case, since GPUs crossed that milestone long ago. Nvidia’s latest flagship contains 2560 CUDA cores.

        So I guess this is actually just a novelty invention, designed just for the sake of being called the first such processor, and I have a strong feeling this will be the last we hear of it.

        • GPUs are not general purpose, but this is. Integer computation notwithstanding, most older processors used to have a separate chip for floating point, and they did fine. In fact, a close design you can see is the Thinking Machines CM1 with 65536 cores spread over (I think) 4096 boards. When was that, 1984? A long time ago.

          Also, what is the power consumption of your CUDA cores?

          Regardless, it is a milestone in design that is far above what others have achieved.

  • I found some details. Here they are:
    The “KiloCore” chip has a maximum computation rate of 1.78 trillion instructions per second and contains 621 million transistors.
    Cores inside “KiloCore” can be independently clocked to a maximum of 1.78GHz, and shut down independently when they’re not being used. The cores also transfer data directly between each other, rather than leaning on a shared cache of memory, which is the norm with today’s commercial processors. All told, “the 1,000 processors can execute 115 billion instructions per second while dissipating only 0.7 watts,” the team says, making the KiloCore 100 times more power-efficient than a laptop despite being built on old 32nm CMOS processor tech from IBM.
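The peak figure quoted above follows from the core count and the maximum clock. The Python sketch below assumes each core completes one instruction per cycle; the names are illustrative. The per-core average it derives for the 0.7 watt operating point is an inference from the quoted totals, not a number from the team.

```python
CORES = 1000
MAX_CLOCK_HZ = 1.78e9        # maximum per-core clock quoted above
INSTRUCTIONS_PER_CYCLE = 1   # assumption: one instruction per cycle per core
LOW_POWER_RATE = 115e9       # instructions per second at 0.7 watts


def peak_rate(cores: int, clock_hz: float, ipc: int) -> float:
    """Upper bound on instructions per second with every core at full clock."""
    return cores * clock_hz * ipc


def average_per_core_rate(total_rate: float, cores: int) -> float:
    """Average instructions per second per core for a given chip-wide rate."""
    return total_rate / cores


if __name__ == "__main__":
    peak = peak_rate(CORES, MAX_CLOCK_HZ, INSTRUCTIONS_PER_CYCLE)
    per_core = average_per_core_rate(LOW_POWER_RATE, CORES)
    print(f"peak: {peak:.3g} instructions per second")
    print(f"0.7 W point: {per_core:.3g} instructions per second per core")
    print(f"0.7 W point is {LOW_POWER_RATE / peak:.1%} of peak")
```

In other words, the 0.7 watt figure describes the chip running far below its maximum rate, at roughly 115 million instructions per second per core on average.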
