AMD Instinct MI250X with MCM GPU to feature 110 Compute Units, 128GB HBM2e memory, and 500W TDP

1 : Anonymous2021/10/24 11:05 ID: qeq0yy
AMD Instinct MI250X with MCM GPU to feature 110 Compute Units, 128GB HBM2e memory, and 500W TDP
2 : Anonymous2021/10/24 13:50 ID: hhv0g02

Almost 50 TFLOPS in double precision, that's crazy!

3 : Anonymous2021/10/24 13:46 ID: hhuzyhx

Now I really hope ROCm finally gets better... those cards sound wonderful for the AI software I develop.

ID: hhz3t1i

Agreed, though for OpenCL workloads, we might get that working properly via Mesa first.

4 : Anonymous2021/10/24 12:37 ID: hhusks5

I assume this isn't meant for gaming

ID: hhuuxsv


ID: hhvej7d

I assume LTT is trying to game on it anyways

ID: hhuvuhz

If you don't mind me asking, what is it meant for?

ID: hhvnyd9

But could it be forced to run games?

ID: hhvpvwo

Not with that attitude

ID: hhwqwtt

Can it run Crysis?

ID: hhws3ml

Yeah it can /s

5 : Anonymous2021/10/24 11:18 ID: hhulk9u

220 CUs would fit the performance figures perfectly. (Assuming that performance in FP64 is half the figure mentioned.)

(220 CUs make a lot more sense considering that the MI100 has 120 CUs.)

ID: hhwhk1m

Not to mention the power figure of 500W. You can't get to 500W at 1.7GHz with just 110 CUs.

ID: hhyqvqh

Just goes to show how much improvement AMD has made to their physical design through these 3 uArchs on 7nm. Vega 20's double precision output was 7.373 TFLOPS at 300W. MI250X is just under 7x that at 2/3rds extra power.
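The comparison above can be checked with a few lines of arithmetic. This is only a sketch: the MI250X figure is taken as a round 50 TFLOPS because the thread only says "almost 50", so the ratios are upper bounds, and the variable names are made up for illustration.

```python
# Figures quoted in the thread; 50.0 is an assumed upper bound ("almost 50 TFLOPS").
vega20_tflops = 7.373
vega20_watts = 300.0
mi250x_tflops = 50.0
mi250x_watts = 500.0

# How much more FP64 throughput, and how much more power.
throughput_ratio = mi250x_tflops / vega20_tflops
power_ratio = mi250x_watts / vega20_watts

# Improvement in FP64 throughput per watt.
perf_per_watt_gain = throughput_ratio / power_ratio

print(f"throughput: {throughput_ratio:.2f}x")
print(f"power: {power_ratio:.2f}x")
print(f"perf per watt: {perf_per_watt_gain:.2f}x")
```

Under these assumptions the throughput ratio lands a little under 7x and the power ratio at 5/3, which is the "2/3rds extra" mentioned above.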

ID: hhwvssa

This is CDNA 2.0. This is the first server chip designed from the ground up. CDNA 1 was a GCN 5.0 evolution. You don't know the configuration of a CU, so a CDNA 2 CU could be way more powerful than a CDNA 1 CU.

ID: hhylhkn

From the other figures, the only other way would be for one CDNA 2 CU to be exactly twice as powerful as one CDNA 1 CU. This would seem somewhat strange. It would also imply 55 CUs per chiplet, which again is a strange number.

It's not completely out of the question that AMD changed what a CU means, but I lean towards mistrusting the rumour.
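The CU argument in this sub-thread can be sketched with the usual peak-throughput formula: CUs x lanes per CU x 2 FLOPs per FMA x clock. The 64 lanes per CU is an assumption carried over from GCN-style CUs; nobody in the thread knows the CDNA 2 CU layout, and the function name is just illustrative.

```python
def peak_tflops(cus, clock_ghz, lanes_per_cu=64, flops_per_lane_per_clock=2):
    """Peak vector throughput in TFLOPS.

    lanes_per_cu=64 is an ASSUMPTION (GCN-style CU), not a confirmed CDNA 2 spec.
    flops_per_lane_per_clock=2 counts one fused multiply-add as two FLOPs.
    """
    gflops = cus * lanes_per_cu * flops_per_lane_per_clock * clock_ghz
    return gflops / 1000.0


# 1.7 GHz is the clock mentioned in the thread.
for cus in (110, 220):
    print(f"{cus} CUs @ 1.7 GHz: {peak_tflops(cus, 1.7):.1f} TFLOPS")
```

With these assumptions, 110 CUs come out at roughly half of the "almost 50 TFLOPS" figure and 220 CUs come out close to it, which is why the rumoured CU count looks suspicious unless a CDNA 2 CU does twice the work per clock.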

6 : Anonymous2021/10/24 11:16 ID: hhulfk8

Cool beans

ID: hhv4i2w

500W. Burned beans.

ID: hhv67bq

Just need more beans then.

ID: hhvg77p

AKA coffee?

7 : Anonymous2021/10/24 12:33 ID: hhus4py

I wonder if CDNA2 is still based on GCN, like CDNA is.

ID: hhuu79h

It is.

ID: hhw3jjs


ID: hhxwh88

Why are they still using GCN? That's weird.

ID: hhw99uc

I wonder if CDNA2 is still based on GCN, like CDNA is.

CDNA 2 is based on CDNA 1, which is based on GCN 5, which...

8 : Anonymous2021/10/24 18:31 ID: hhw3ycp

Will it msrp?

9 : Anonymous2021/10/24 11:48 ID: hhunz33

What’s the eth hash rate lol

ID: hhvdxb4

Napkin math:

128GB HBM2e - divides into eight 16GB stacks. JEDEC standard for HBM2e is 307GB/s/stack, Samsung HBM2e is 410GB/s/stack, SKHynix HBM2e is 460GB/s/stack. So memory bandwidth (the limiting factor, generally, in Ethereum mining) is somewhere between 2456GB/s and 3680GB/s.

307GB/s/stack * 8 stacks = 2456GB/s
410GB/s/stack * 8 stacks = 3280GB/s
460GB/s/stack * 8 stacks = 3680GB/s

Let's use a Radeon VII as the other reference point - it's a GCN based chip (as these likely are) with HBM and has 1024GB/s memory bandwidth, with a TDP of 295W. It's also a fairly popular mining card still, and clocks in at around 93MH/s at 200W according to whattomine.

For JEDEC standard HBM2e

2456 GB/s / 1024 GB/s = X MH/s / 93 MH/s   # GB/s cancel each other out, multiply through by 93 MH/s
2456 * 93 MH/s / 1024 = X MH/s
223 MH/s = X MH/s

For Samsung HBM2e

3280 GB/s / 1024 GB/s = X MH/s / 93 MH/s   # GB/s cancel each other out, multiply through by 93 MH/s
3280 * 93 MH/s / 1024 = X MH/s
297 MH/s = X MH/s

For SKHynix HBM2e

3680 GB/s / 1024 GB/s = X MH/s / 93 MH/s   # GB/s cancel each other out, multiply through by 93 MH/s
3680 * 93 MH/s / 1024 = X MH/s
334 MH/s = X MH/s

Power. Radeon VII is 295W but mines at around 200W. Assuming the same scaling: 500W TDP -> 338W. Let's just say 350W to be more conservative.

TL;DR: Somewhere between 223MH/s and 334MH/s at around 350W, easily making it one of the most efficient mining cards out there. According to whattomine that's about $16 - $24 in pre-tax profit per day.
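The napkin math above can be reproduced in a few lines. This is a rough sketch that assumes, as the comment does, that hash rate scales linearly with memory bandwidth and that mining power scales linearly with TDP; all input figures are the ones quoted in the comment, and the function names are made up for illustration.

```python
# Reference card figures quoted above (Radeon VII).
REF_BANDWIDTH_GBS = 1024.0
REF_HASHRATE_MHS = 93.0
REF_TDP_W = 295.0
REF_MINING_POWER_W = 200.0

STACKS = 8  # 128GB split into eight 16GB stacks
PER_STACK_GBS = {"JEDEC": 307.0, "Samsung": 410.0, "SK Hynix": 460.0}


def estimate_hashrate(bandwidth_gbs):
    """ASSUMES hash rate is proportional to memory bandwidth."""
    return bandwidth_gbs * REF_HASHRATE_MHS / REF_BANDWIDTH_GBS


def estimate_mining_power(tdp_w):
    """ASSUMES mining power is the same fraction of TDP as on the reference card."""
    return tdp_w * REF_MINING_POWER_W / REF_TDP_W


estimates = {}
for vendor, per_stack in PER_STACK_GBS.items():
    total_bw = per_stack * STACKS
    estimates[vendor] = estimate_hashrate(total_bw)
    print(f"{vendor}: {total_bw:.0f} GB/s -> {estimates[vendor]:.1f} MH/s")

print(f"Estimated mining power: {estimate_mining_power(500.0):.1f} W")
```

The results match the figures above to within a MH/s or a watt; the comment truncates where this rounds.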

ID: hhvkvik

Would only take two years to break even at that rate...

ID: hhus3mm

Doesn't matter, MSRP will be $2000 and you won't be able to find it under $5000. /s

EDIT: Clearly I wasn't trying to name actual prices, I just picked a number out of my ass that sounded expensive and then multiplied it. Get over it.

ID: hhuuik5

Lol. This is Instinct MI. Its MSRP will be closer to $10,000.

ID: hhv4zx3

You'll be lucky to get it under 10k lol

ID: hhuu2f2

It would be a steal at either of these prices.

10 : Anonymous2021/10/24 11:49 ID: hhuo3nt

I wonder how many fps it would get in Crysis at 15360x8640 resolution.

ID: hhv51pm

Literally 0

ID: hhv728v

It has no video out

ID: hhv35ut


ID: hhvdefi

Theoretically something, even with the lack of video out. There's a setting in Windows you can turn on if you have a CPU with integrated graphics that lets you use the integrated graphics for light tasks like browsing and the GPU for heavy tasks like gaming.

11 : Anonymous2021/10/24 14:18 ID: hhv3t6l

With that kind of TDP it’s definitely going to hog at least 4-5 expansion slots

ID: hhv65dn

Probably water-cooled in server racks would be my guess; that's a lot of heat to remove from the area.

ID: hhvq14p

Nah, either water cooled or just LOTS of airflow from some good ol deltas

ID: hhvq6w9

If it’s using deltas, I figure it will need a massive heatsink. I’m just going off the trend I’ve noticed with GPU heatsinks getting larger every couple of years.

ID: hhwrkm0

It isn't a standard PCIe card. It's based around the OCP Accelerator Module (OAM).

12 : Anonymous2021/10/24 19:35 ID: hhwdh3b

I'd buy it

13 : Anonymous2021/10/24 21:44 ID: hhwvzqp

What would something like this be used for? Is this AMD's version of Quadro from Nvidia?

ID: hhy6tep

An equivalent to Nvidia’s Tesla line.

14 : Anonymous2021/10/24 23:09 ID: hhx71eh

I wonder what AMD's doing about their solution stack for reinforcement learning. No use releasing a powerful GPU without accompanying software for writing code.

15 : Anonymous2021/10/25 04:50 ID: hhy986y

Why does it have the same numbers for single precision and double precision? Isn't single usually faster than double?

17 : Anonymous2021/10/25 14:54 ID: hhzpj44

Missing something key here... why is their Instinct line so incredibly expensive compared to Nvidia?

Wouldn't it make sense to push the product at a loss to increase adoption amongst data scientists and ML/AI engineers?

I badly want an MI50 but gd... for that price I could get two PNY Nvidia A16s...

18 : Anonymous2021/10/24 16:27 ID: hhvlcty

MI? Weird that it has the same brand as some of the most untrusted Chinese smartphones. Does AMD explain this?

19 : Anonymous2021/10/24 17:04 ID: hhvqrg2

I'm not sure why you think Aircraft Manufacturing and Design would have anything to say about this. /s

20 : Anonymous2021/10/24 17:45 ID: hhvwwm0

Since they have nothing to do with that company, I would assume, no, AMD does not have an explanation for this.

21 : Anonymous2021/10/24 18:36 ID: hhw4o9d

You’re a moron

