halfbakery
Clearly this is a metaphor for something.
|
|
|
According to Wikipedia, current DRAM transfer rates can
reach up to 204.8 Gbit/s. This is roughly 200 times
faster than gigabit Ethernet, or in technical terms,
really, really fast.
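To put rough numbers on that comparison, here is a small sketch using only the two figures quoted above. The 1 GB payload is an assumed example size (taken as 8 Gbit), and the times are ideal ones that ignore any protocol overhead.

```python
# Figures quoted in the post: 204.8 Gbit/s for DRAM, 1 Gbit/s for gigabit Ethernet.
DRAM_GBIT_PER_S = 204.8
GIGABIT_ETHERNET_GBIT_PER_S = 1.0


def speedup(fast_gbit_per_s: float, slow_gbit_per_s: float) -> float:
    """Return how many times faster the first link is than the second."""
    return fast_gbit_per_s / slow_gbit_per_s


def transfer_seconds(size_gbytes: float, link_gbit_per_s: float) -> float:
    """Ideal time to move size_gbytes (decimal GB) over a link, ignoring overhead."""
    return size_gbytes * 8 / link_gbit_per_s


ratio = speedup(DRAM_GBIT_PER_S, GIGABIT_ETHERNET_GBIT_PER_S)
print(f"DRAM is about {ratio:.0f}x gigabit Ethernet")
# Assumed example payload: 1 GB.
print(f"1 GB over the DRAM bus: {transfer_seconds(1, DRAM_GBIT_PER_S) * 1000:.1f} ms")
print(f"1 GB over gigabit Ethernet: {transfer_seconds(1, GIGABIT_ETHERNET_GBIT_PER_S):.1f} s")
```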
What if we designed a DIMM that didn't contain any
memory chips, but instead was designed to plug into
TWO motherboards? It would look sort of like a taller
DIMM, but with contacts on both edges. A special case
could be designed where one motherboard would be
flipped over and rotated 180º so that the DIMM slots line
up. Both motherboards would have a CPU, RAM (in other
DIMM slots), and (potentially) a full complement of
cards filling the PCIe slots.
One system (the slave system) would be loaded with a
special operating system designed only to shunt
requests coming in over the memory bus to and from the
various hardware devices. The other system (the
master system) would have a normal operating system
installed, with a special driver that maps a virtual
portion of RAM to control hardware requests to the slave
system.
Communications between the two systems would be so
fast that they could appear to the user as if they were
a single system with double the RAM and CPU power.
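One way to picture the master/slave arrangement is as a mailbox in a memory window that both boards can see. The sketch below is purely illustrative: a Python bytearray stands in for the shared window, and the layout (two flag bytes, a 16-bit length, then the payload), the function names, and the fake_disk handler are all assumptions rather than anything specified in the idea.

```python
import struct
from typing import Callable, Optional

WINDOW_SIZE = 64                 # hypothetical size of the shared window, in bytes
REQ_READY, RESP_READY = 0, 1     # offsets of two one-byte "doorbell" flags
LEN_OFFSET, DATA_OFFSET = 2, 4   # 16-bit payload length, then the payload itself

window = bytearray(WINDOW_SIZE)  # stands in for memory visible to both boards


def _write_payload(payload: bytes) -> None:
    if len(payload) > WINDOW_SIZE - DATA_OFFSET:
        raise ValueError("payload too large for window")
    struct.pack_into("<H", window, LEN_OFFSET, len(payload))
    window[DATA_OFFSET:DATA_OFFSET + len(payload)] = payload


def _read_payload() -> bytes:
    (length,) = struct.unpack_from("<H", window, LEN_OFFSET)
    return bytes(window[DATA_OFFSET:DATA_OFFSET + length])


def master_send(request: bytes) -> None:
    """Master-side driver: place a request in the window, then raise the flag."""
    _write_payload(request)
    window[REQ_READY] = 1


def slave_poll(handler: Callable[[bytes], bytes]) -> bool:
    """Slave-side OS: if a request is waiting, run it on local hardware and reply."""
    if not window[REQ_READY]:
        return False
    request = _read_payload()
    window[REQ_READY] = 0
    _write_payload(handler(request))
    window[RESP_READY] = 1
    return True


def master_receive() -> Optional[bytes]:
    """Master-side driver: collect the reply if one is ready, else return None."""
    if not window[RESP_READY]:
        return None
    response = _read_payload()
    window[RESP_READY] = 0
    return response


def fake_disk(request: bytes) -> bytes:
    """Hypothetical stand-in for a hardware device attached to the slave."""
    return b"OK:" + request


master_send(b"read sector 7")
slave_poll(fake_disk)
print(master_receive())
```

Both sides poll flags here because the sketch assumes the memory bus gives neither system a way to signal the other directly; real hardware would also have to deal with caching and bus timing, which this ignores.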
Dual port RAM
http://en.wikipedia.../wiki/Dual_port_RAM Baked. [8th of 7, Nov 12 2013]
|
|
Not sure I see what the problem is in terms of cooling.
The two motherboards would only overlap slightly, so you would have
the normal amount of space
for each motherboard to install your fans, water
coolers, liquid nitrogen baths, or what have you. |
|
|
The DIMM itself wouldn't be difficult to cool either,
since it would stick out much farther than adjacent
DIMMs, and (if necessary) could have its own cooler
installed that didn't touch anything else. |
|
|
Baked to a crisp:
dual-port RAM and
memory bus sharing have been around since
the 1960s. |
|
|
[suggested-for-deletion], Widely Known To
Exist. |
|
|
Okay, but what does that have to do with this? |
|
|
Dual-ported RAM allows simultaneous memory reads
and writes. This idea proposes converting a memory
slot into a high-bandwidth networking interface, by
means of a special DIMM (that doesn't even contain any
usable RAM, dual-ported or otherwise). |
|
|
//It's all SATA or equivalent under the hood.// |
|
|
With a substantially slower bus like SATA, or even
10Gbit Ethernet, you're not going to be able to have
the two computers communicate at a low level, e.g.
rendering graphics for one system on the other
system in real time, or offloading CPU calls from one
system to another for tasks that require high speed
and low latency (real time video decoding and
effects, for example). |
|
|
A cluster that uses a slower transport medium is
going to be limited by the speed of that medium.
There's pretty much nothing faster than DRAM access,
so it would be the best medium for linking two
systems so that they appear to the user as one. |
|
|
//a concrete application// |
|
|
//Each pin on a DIMM is about as fast as SATA or
Ethernet. It's just that there are more pins.// |
|
|
So? The interface is still very fast overall, and that's
the point. Every single consumer (or professional)
motherboard out there is already equipped with the
network interface to make this work; all you need is
a simple adapter with some basic circuitry and a
software driver. No need for //High end costly
topologies//. |
|
|
Oh, and a specially designed computer case, you'll
need that too. But for not much more (or perhaps
even slightly less, since you could eliminate some
redundant components like graphics cards and such)
than the cost of two little computers, you could have
one big computer, which is better from the
standpoint of efficiently maximizing resource usage
anyway. |
|
|
I like this for high availability applications [+]. With
HA, the cross-connect speed is always the
bottleneck. This would help a lot. |
|
|
//I think the cheapest solution for you is probably a multiple CPU motherboard. Each
CPU has access to the full memory bus.// |
|
|
How is that cheaper? You pay a premium for a dual-CPU motherboard, then on top of
that you pay a hefty premium for the processors, because a dual-CPU motherboard only
works with Xeons. You can get a 3.4GHz 6-core i7 processor for under $600, but a
2.66GHz 6-core Xeon will run you upwards of a grand. So you're paying an $800 premium
just to be able to run dual processors. A second motherboard would only run you a
couple hundred, and you can get RAM for that motherboard for a couple hundred more.
You end up saving several hundred bucks, and are able to run at a much higher clock
speed to boot. |
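The arithmetic behind that comparison can be laid out as a quick sketch. The CPU prices are the round figures quoted above; "a couple hundred" is taken as $200, which is an assumption, and the extra cost of the dual-socket motherboard itself is left out because no figure is given for it.

```python
# Prices as quoted in the post; "a couple hundred" is assumed to mean $200.
I7_PRICE = 600            # "under $600"
XEON_PRICE = 1000         # "upwards of a grand"
SECOND_MOTHERBOARD = 200  # assumed value for "a couple hundred"
SECOND_RAM = 200          # assumed value for "a couple hundred more"


def dual_cpu_premium(cpus: int = 2) -> int:
    """Extra spent on Xeons instead of i7s for the given number of CPUs."""
    return cpus * (XEON_PRICE - I7_PRICE)


def second_board_cost() -> int:
    """Extra spent on a second motherboard and its RAM."""
    return SECOND_MOTHERBOARD + SECOND_RAM


premium = dual_cpu_premium()
extra = second_board_cost()
print(f"Xeon premium for two CPUs: ${premium}")
print(f"Second motherboard plus RAM: ${extra}")
print(f"Net saving with two boards: ${premium - extra}")
```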
|
|
As a matter of fact, I already own a dual-CPU system (a Mac Pro), and I paid quite a bit
for the privilege of having 12 cores and 32GB of RAM readily available. It's still not as
fast as I'd like (and yes, I'm utilizing all of the power of my system; I mainly use it for
tasks that are easily parallelized). Heck, if I could join two of these systems and end up
with 24 cores, I'd seriously consider it. I've tried networking multiple systems, but it ends
up being a hassle and unreliable, and the slowest system on the network tends to bog
things down disproportionately. A simple way to meld two systems into one would be
incredibly useful. |
|
|
Indeed. And I also have my system set up to use spare
CPU cycles to help crunch numbers for disease
research and such. So I'm not going to come out and
*say* that by opposing this idea you're in favor of
letting kids die of cancer, but...
|
|
|
Yes, I have. The problem is the GPU is only really usable for very specific
purposes, so in order to take advantage of it your application has to be
specifically tailored to use it. It's thus very difficult to provide a flexible
general rendering interface for the graphics hardware. If you switch to a
different video codec, or apply a different kind of image filter, your super-
fast GPU code becomes useless. |
|
|
A handful of graphics/video applications take advantage of the GPU for
rendering, but even the ones that do often only use the GPU for certain
limited functionality, or yield substantially reduced quality with hardware
rendering versus software. There's just no substitute for raw CPU power in
most cases. |
|
|
So you're looking to make a blazingly fast connection between two ordinary/cheap motherboards. |
|
|
Just so you know, you'll pretty much HAVE to have a dual-port memory between them to use this interface effectively. With no memory in between, you'd need to arrange some way to have one computer doing a read at exactly the same time the other is doing a write. And both motherboards are going to be trying to drive the clock. |
|
|
Of course you probably don't need a really large dual port RAM to make a fairly efficient protocol for rapidly dumping large blocks of data across. |
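As a rough illustration of how a small shared buffer could carry a large block, here is a single-writer, single-reader ring buffer sketch. The RingBuffer class, the 64-byte capacity, and the 16-byte chunk size are all hypothetical, and the two sides simply take turns; real hardware would have two independent clock domains to contend with, which this does not model.

```python
class RingBuffer:
    """Single-producer, single-consumer byte ring, standing in for a small dual-port RAM."""

    def __init__(self, capacity: int) -> None:
        self.buf = bytearray(capacity)
        self.capacity = capacity
        self.head = 0  # total bytes written (advanced only by the writer)
        self.tail = 0  # total bytes read (advanced only by the reader)

    def write(self, data: bytes) -> int:
        """Copy as much of data as fits; return the number of bytes accepted."""
        free = self.capacity - (self.head - self.tail)
        n = min(free, len(data))
        for i in range(n):
            self.buf[(self.head + i) % self.capacity] = data[i]
        self.head += n
        return n

    def read(self, max_bytes: int) -> bytes:
        """Remove and return up to max_bytes of the oldest unread data."""
        n = min(max_bytes, self.head - self.tail)
        out = bytes(self.buf[(self.tail + i) % self.capacity] for i in range(n))
        self.tail += n
        return out


def transfer(block: bytes, ring: RingBuffer, chunk: int = 16) -> bytes:
    """Stream block through ring, alternating writer and reader turns."""
    received = bytearray()
    sent = 0
    while len(received) < len(block):
        sent += ring.write(block[sent:sent + chunk])
        received += ring.read(chunk)
    return bytes(received)


block = bytes(range(256)) * 4    # 1 KiB of sample data
ring = RingBuffer(capacity=64)   # hypothetical 64-byte shared buffer
print(transfer(block, ring) == block)
```

Because each index is advanced by only one side, the writer and reader never need to modify the same counter, which is the property that makes this layout a natural fit for a memory with two ports.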
|
|
Yes, of course you'd need some sort of buffer or controller
to get the machines to talk. That was what I meant by
basic circuitry in a previous anno. It might include a
small
amount of RAM (though such memory would not be directly
accessible to either system). But in what way does that
make this idea itself similar to the concept of dual-ported
RAM? |
|
|
So far I've gotten two MFDs (well, an MFD and an SFD),
neither of which I feel is justified. The idea is clearly not
baked, as no such thing exists, even if some of the
components to make it exist (and why is that not a /good/
thing?). As for bad science, that doesn't even make sense. I
haven't seen a single argument that this idea isn't possible
to implement, just that it wouldn't be terribly useful or
practical, or the most efficient way to combine multiple
systems. So fine, you think it doesn't solve the problem it
sets out to; that just makes it a bad idea, in your opinion.
But in what way is my science flawed? |
|
|
Feel free to criticize the idea based on its merits, but I
submit that the idea is neither baked nor unbakeable, and
thus both MFDs raised thus far are inappropriate. |
|
|
[+] by the way. I also don't see any reason for MFD. Even if this is simply implemented as a dual-port RAM, I'm not aware of any dual-port RAM with a dual DIMM interface. I'm actually not even sure there is a dual-port RAM readily available that can handle two different clock domains: a necessity, since you can't synchronize the system clocks on two standard cheap motherboards. Any traditional clock resynchronization scheme is going to introduce a cycle of latency, slowing down the interface (assuming you can even make that work on a DDRx SDRAM interface). |
|
|
There are "asynchronous Dual-Port RAMs", but those have two asynchronous RAM interfaces. Since most modern motherboards use DDRx Synchronous DRAM, you'd need a dual-port RAM with two synchronous interfaces that can operate asynchronously. |
|
|
If you implement this as something more complex than a shared memory space, you'll basically be emulating the DDRx SDRAM interface, which might allow using less exotic memory for your buffer, but might be difficult to design. You might also consider a ribbon cable to a PCI card on each motherboard to allow triggering interrupts when a packet is ready to read because I don't think the DIMM slot is set up for the RAM to send an interrupt to the CPU. |
|
|
Ignoring for the moment the idea of putting two 8255's back to back (been there, done that), you're going to hit huge issues with physical geometry. |
|
|
Mobos generally have a dirty great heatsink on the CPU. That limits orientation. It's not clear how close you can get the DIMM sockets on the boards, and extending the memory bus lines more than a fraction will introduce huge timing problems (extra capacitance) and probably stop the interface working at all. |
|
|
Dell blade servers in a backplane would do much better, and they're off the shelf. |
|
|
I think you're arguing that this will be fast and cheap, because memory is fast and cheap.
However, your proposal is to bring in essentially a complete computer to manage the peripherals. This is obviously a much more complex entity. As described it seems to be essentially a full PC, slaved to the master PC.
I struggle to see how this would be a fast, cheap (or reliable) way of talking with peripherals compared to a well-designed interface. |
|
|
I think there are two ways to read the idea. |
|
|
Firstly you could see it as a way of getting more out of consumer-grade or 'off-the-shelf' stuff. Essentially, getting stuff that works with Windows/PC drivers to transparently slot in and work.
It may possibly help by taking away the overhead of distractions like interrupts and polling loops from the master system, in a backwards-compatible manner. However, the information has to get there eventually, and you've introduced a latency, along with considerably more ways for it all to go horribly wrong.
What it wouldn't do is transparently double the RAM and CPU power. It wouldn't do anything for those at all without some nasty hacks to the master's OS, if you were trying to get it to run legacy stuff.
It may be that this would allow you to hang more stuff off a single PC. But only twice as much of anything at most. If it were a method which allowed an arbitrary amount of stuff to be attached then it would be more interesting. |
|
|
Secondly you could be proposing a method of parallel computing in a new system designed from scratch. In which case it seems pretty inelegant and unnecessarily restricted. |
|
| |