Hello, I’m going back and forth about which new home server to build. One thing I’ve stumbled across: the i9 and the new Core Ultra 9 processors all support a maximum of only 192GB RAM. However, some mainboards support 256GB (with 4 RAM slots and dual channel). Why?

I want to have the option of maxing out the RAM later.

I could buy 4x48GB RAM now and be at 192GB. Maybe I would be annoyed later that 64GB of RAM is still “missing” compared to what the board supports. But what if I buy 4x64GB RAM? 3x64GB RAM makes no sense, because then dual channel is not fully used. And 4x64GB is probably not recognized by the processor?

Or are there LGA1851 or LGA1700 processors capable of handling 256GB RAM?

  • battlesheep@lemmy.world (OP)
    5 days ago

    My edge case is: I want to spin up an AI LXC in Proxmox with Ollama and Open WebUI, using RAM instead of VRAM. But it should be low on power consumption at idle. That’s why I want an Intel i9 or Core Ultra 9 with maxed-out RAM. It idles at low power, but can run bigger AI models using RAM instead of VRAM. It would not be as fast as with GPUs, but that’s OK.
    I think a Xeon would need more power…much more power at idle. I have an old Xeon E3-1275 v5 with 32 GB RAM and a supermicro D3417-B mainboard, and it idles at about 10 watts. This is fantastic, but I don’t think I can get a good newer Xeon with consumption as low as this. Still, I want to send the old lady into retirement.

    • hendrik@palaver.p3x.de
      5 days ago

      AI inference is memory-bound, so memory bandwidth (bus width times transfer rate) is the main bottleneck. I also do AI on an (old) CPU, but the CPU itself is mostly idle, waiting for the memory. I’d say it’ll likely be very slow, like waiting 10 minutes for a longer answer. I believe a lot of the AI people use Apple silicon because of the unified memory and its bus width, or some CPU with multiple memory channels. The CPU speed doesn’t really matter. You could choose a much slower one, because the actual multiplications aren’t what slows it down. But you seem to be doing the opposite: getting a very fast processor with just 2 memory channels.

      • battlesheep@lemmy.world (OP)
        5 days ago

        The i9-10900X has 4 channels (quad-channel DDR4-2933, PC4-23466, 93.9 GB/s). Would this be better in this respect than an i9-14xxx (dual-channel DDR5-5600, PC5-44800, 89.6 GB/s)?

        Do the numbers (93 GB/s and 89 GB/s) mean the speed of a single RAM stick or the speed of all of them together? Maybe an old i9-10xxx with 4-channel RAM would be better than a new dual-channel one.
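For reference, both quoted figures can be reproduced from the module names alone, which suggests they are totals across all channels rather than per-stick numbers. A minimal sketch, assuming the usual 64-bit data bus per channel and theoretical peak rates (real-world throughput is lower); the function name `peak_bandwidth_gbs` is made up for illustration:

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, channels: int,
                       bus_width_bits: int = 64) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    transfer_rate_mts: the number in the module name, e.g. 2933 for DDR4-2933.
    bus_width_bits:    data width per channel; 64 is assumed here (DDR5 splits
                       this into two 32-bit subchannels, same total).
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer * channels / 1e9


quad_ddr4 = peak_bandwidth_gbs(2933, channels=4)  # quad-channel DDR4-2933
dual_ddr5 = peak_bandwidth_gbs(5600, channels=2)  # dual-channel DDR5-5600

print(f"quad-channel DDR4-2933: {quad_ddr4:.1f} GB/s")
print(f"dual-channel DDR5-5600: {dual_ddr5:.1f} GB/s")
```

On paper the two platforms land within about 5% of each other, so the older quad-channel setup has no large bandwidth advantage over dual-channel DDR5-5600.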

        • hendrik@palaver.p3x.de
          5 days ago

          Well, the numbers I find on Google are: an Nvidia 4090 can transfer 1008 GB/s, and an i9 does something like 90 GB/s. So you’d expect the CPU to be roughly 11 times slower than that GPU at fetching numbers from memory.

          I think if you double the number of DDR channels for your CPU, and if that also meant your transfer rate would double to 180 GB/s, you’d be roughly 6 times slower than the 4090. I’m not sure if it works exactly like that, but I’d guess so. And a larger model also means more numbers to transfer per token. So if you now also use your larger memory to run a 70B parameter model instead of a 12B parameter model (or whatever fits on a GPU), then at 90 GB/s your tokens will come in at about a 65th of the speed of the 12B model on the GPU (roughly 11 times 70/12).
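The back-of-the-envelope numbers above can be checked with a few lines. This is only a sketch under simplifying assumptions that are not stated in the thread: time per token is taken to be proportional to model size divided by memory bandwidth, both models use the same number of bytes per parameter, and the theoretical peak bandwidths are actually reached. The helper name `relative_slowdown` is hypothetical.

```python
def relative_slowdown(ref_bandwidth_gbs: float, ref_params_b: float,
                      bandwidth_gbs: float, params_b: float) -> float:
    """Estimate how many times slower token generation is than a reference setup.

    Assumes time per token ~ params / bandwidth (memory-bound inference,
    same bytes per parameter on both sides).
    """
    return (ref_bandwidth_gbs / bandwidth_gbs) * (params_b / ref_params_b)


GPU_BW = 1008.0  # GB/s, figure quoted for the Nvidia 4090
CPU_BW = 90.0    # GB/s, rough figure quoted for a dual-channel i9

# Same 12B model on CPU vs. GPU
same_model = relative_slowdown(GPU_BW, 12, CPU_BW, 12)
# Same model, hypothetically doubled memory channels (180 GB/s)
doubled_channels = relative_slowdown(GPU_BW, 12, 2 * CPU_BW, 12)
# 70B model on the 90 GB/s CPU vs. 12B model on the GPU
bigger_model = relative_slowdown(GPU_BW, 12, CPU_BW, 70)

print(f"12B on CPU vs. 12B on GPU:          {same_model:.1f}x slower")
print(f"12B on CPU with doubled channels:   {doubled_channels:.1f}x slower")
print(f"70B on CPU vs. 12B on GPU:          {bigger_model:.1f}x slower")
```

The three results correspond to the "roughly 11 times", "roughly 6 times", and "a 65th of the speed" estimates in the comment.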